Abstract

We show that the first-order theory of structural subtyping 
of non-recursive types is decidable. 

Let $\Sigma$ be a language consisting of function symbols
(representing type constructors) and $C$ a decidable structure in
the relational language $L$ containing a binary relation $\leq$. $C$
represents primitive types; $\leq$ represents a subtype ordering.
We introduce the notion of $\Sigma$-term-power of $C$, which
generalizes the structure arising in structural subtyping. The
domain of the $\Sigma$-term-power of $C$ is the set of
$\Sigma$-terms over the set of elements of $C$.

We show that the decidability of the first-order theory of $C$
implies the decidability of the first-order theory of the
$\Sigma$-term-power of $C$. This result implies the decidability of
the first-order theory of structural subtyping of non-recursive
types.

Our decision procedure is based on quantifier elimination 
and makes use of quantifier elimination for term algebras 
and Feferman-Vaught construction for products of decidable 
structures. 

We also explore connections between the theory of structural
subtyping of recursive types and monadic second-order theory of
tree-like structures. In particular, we give an embedding of the
monadic second-order theory of the infinite binary tree into the
first-order theory of structural subtyping of recursive types.
1 Introduction

Subtyping constraints are an important technique for checking and
inferring program properties, used both in type systems and program
analyses [34, 16, 13, 28, 23, 4, 3, 1, 2, 20, 41, 17, 54, 7, 8, 5,
42, 47, 19].

This paper presents a decision procedure for the first-order theory
of structural subtyping of non-recursive types. This result solves
(for the case of non-recursive types) a problem left open in [48].
[48] provides the decidability result for structural subtyping of
only unary type constructors, whereas we solve the problem for any
number of constructors of any arity. Furthermore, we do not impose
any constraints on the subtyping relation $\leq$; it need not even
be a partial order. The generality of our construction makes it
potentially of independent interest in logic and model theory.

We approach the problem of structural subtyping using quantifier
elimination and, to some extent, using monadic second-order logic
of tree-like structures. This paper makes the following
contributions:

• we give a new presentation of the Feferman-Vaught theorem
for direct products using a multisorted logic (Section 3.3);
for completeness we also include a proof of quantifier
elimination for boolean algebras of sets (Section 3.2);

• we give a new presentation of decidability of the first-order
theory of term algebras; the proof uses the language of both
constructor and selector symbols (Section 3.4);

• as an introduction to the main result, we show decidability
of structural subtyping with one covariant binary constructor
and two constants (Section 4); this result does not rely on
the Feferman-Vaught technique;

• we present a new construction, term-power algebra for 
creating tree-like theories based on existing theories 
(Section 5); 

• as a central result, we prove that if the base theory 
is decidable, so is the theory of term-power with ar- 
bitrary variance of constructors; we give an effective 
decision procedure for quantifier elimination in term- 
power structure; the procedure combines elements of 
quantifier elimination in Feferman-Vaught theorem and 
quantifier elimination in term algebras (Sections 5, 6);

• we show the decidability of structural subtyping of
non-recursive types as a direct consequence of the main result;

• we give a simple embedding of monadic second-order 
theory of infinite binary tree into the theory of struc- 
tural subtyping of recursive types with two primitive 
types (Section 7.1); 

• we show that structural subtyping of recursive types
where terms range over constant shapes is decidable
(Section 7.4).

In addition to showing the decidability of structural sub- 
typing, our hope is to promote the important technique of 
quantifier elimination, which forms the basis of our result. 

Quantifier elimination [22, Section 2.7] is a fruitful technique
that was used to show decidability and classification of boolean
algebras [46, 51], decidability of term algebras [31, Chapter 23],
[39, 30], with membership constraints [10] and with queues [43],
decidability of products [35, 14], [31, Chapter 12], and
algebraically closed fields [50].

The complexity of the decision problem for the first-order 
theory of structural subtyping has a non-elementary lower 
bound. This is a consequence of a general theorem about 
pairing functions [15, Theorem 1.2, Page 163] and applies to 
term algebras already, as observed in [39, 43].

2 Preliminaries 

In this section we review some notions used in this paper.

If $w$ is a word over some alphabet, we write $|w|$ for the
length of $w$. We write $w_1 \cdot w_2$ to denote the concatenation
of words $w_1$ and $w_2$.

A node $v$ in a directed graph is a sink if $v$ has no outgoing
edges. A node $v$ in a directed graph is a source if $v$ has no
incoming edges.

We write $E_1 \equiv E_2$ to denote equality of syntactic entities
$E_1$ and $E_2$.

We write $\bar{x}$ to denote some sequence of variables
$x_1, \ldots, x_n$.

We assume that formulas are built from propositional
connectives $\wedge$, $\vee$, $\neg$; the remaining connectives are
defined as shorthands. Connective $\neg$ binds the strongest,
followed by $\wedge$ and $\vee$.

A literal $L$ is an atomic formula $A$ or a negation of an
atomic formula $\neg A$. We define complementation of a literal
by $\overline{A} = \neg A$ and $\overline{\neg A} = A$.

A formula $\psi$ is in prenex form if it is of the form

$$Q_1 x_1 \ldots Q_n x_n.\ \phi$$

where $Q_i \in \{\forall, \exists\}$ for $1 \leq i \leq n$ and $\phi$
is a quantifier-free formula. We call $\phi$ a matrix of $\psi$.

If $\phi$ is a formula then $FV(\phi)$ denotes the set of free
variables in $\phi$.

We write $[x_1 \mapsto a_1, \ldots, x_k \mapsto a_k]$ for the
substitution $\sigma$ such that $\sigma(x_i) = a_i$ for
$1 \leq i \leq k$.

If $\phi$ is a formula and $t_1, \ldots, t_k$ terms, we write
$\phi[x_1 := t_1, \ldots, x_k := t_k]$ for the result of
simultaneously substituting free occurrences of variables $x_i$
with term $t_i$, for $1 \leq i \leq k$.

We write $h(t)$ for the height of term $t$. $h(a) = 0$ if $a$ is
a constant, $h(x) = 0$ if $x$ is a variable. If
$f(t_1, \ldots, t_k)$ is a term then

$$h(f(t_1, \ldots, t_k)) = 1 + \max(h(t_1), \ldots, h(t_k))$$

We assume that all function symbols are of finite arity. If
there are finitely many function symbols then for any nonnegative
integer $k$ there is only a finite number of terms $t$ such that
$h(t) \leq k$.
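The height function and the finiteness remark can be illustrated with a small sketch. It assumes a hypothetical signature with two constants `a`, `b` and one binary symbol `f`, represents terms as nested tuples, and handles ground terms only (variables are left out for brevity).

```python
from itertools import product

# Hypothetical signature (an assumption for illustration): symbol -> arity.
SIGNATURE = {"a": 0, "b": 0, "f": 2}

def height(term):
    """Height of a ground term given as a nested tuple (symbol, arg1, ..., argk)."""
    _symbol, *args = term
    if not args:                      # constants have height 0
        return 0
    return 1 + max(height(arg) for arg in args)

def terms_up_to(k, signature=SIGNATURE):
    """Return the (finite) set of all ground terms t with height(t) <= k."""
    terms = {(s,) for s, arity in signature.items() if arity == 0}
    for _ in range(k):
        larger = set(terms)
        for s, arity in signature.items():
            if arity > 0:
                # Apply s to every tuple of terms built so far.
                for args in product(terms, repeat=arity):
                    larger.add((s, *args))
        terms = larger
    return terms
```

For this signature the sets grow quickly but stay finite: 2 terms of height at most 0, 6 of height at most 1, and 38 of height at most 2.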

If $\phi(u)$ is a conjunction of literals, we say that $\phi'$
results from $\exists u. \phi(u)$ by dropping quantified variable
$u$ iff $\phi'$ is the result of eliminating from $\phi(u)$ all
conjuncts containing $u$. More generally, if $\psi$ is a formula of
form

$$Q_1 x_1 \ldots Q u \ldots Q_k x_k.\ \phi_0$$

then the result of dropping $u$ from $\psi$ is

$$Q_1 x_1 \ldots Q_k x_k.\ \phi'_0$$

where $\phi'_0$ is the result of dropping $u$ from
$\exists u. \phi_0$.

An equality is an atomic formula $t_1 = t_2$ where $t_1$ and $t_2$
are terms. A disequality is a negation of an equality.
We use the usual Tarskian semantics of formulas. Unless
otherwise stated, $\phi \models \psi$ will denote that formula
$\phi \Rightarrow \psi$ is true in a fixed relational structure
that is under current consideration.

Occasionally we find it convenient to work with multisorted
logic, where the domain is a union of disjoint sets called sorts,
and arity specifies the sorts of all operations. Constants are
operations with zero arguments. Relations are operations that
return the result in a distinguished sort bool interpreted over
the boolean lattice {false, true} or over the distributive lattice
of three-valued logic {false, true, undef} (see Section 2.3).

A structure $C$ of a given language $L$ is a pair of a domain
$C$ and the interpretation function $[\![\_]\!]^C$. Hence, we name
operations of the structure using symbols of the language and
the interpretation function. If $C$ is clear from the context
we write simply $[\![\_]\!]$ for $[\![\_]\!]^C$.

In Section 3.3 and Section 6 we use logic with several
kinds of quantifiers. Our logic is first-order, but we give
higher-order types to quantifiers. For example, a quantifier

$$Q :: (A \to B) \to B$$

denotes a quantifier that binds variables of sort $A$ enclosed
within an expression of sort $B$ and returns an expression of
sort $B$. If $X$ and $Y$ are sets then $X \to Y$ denotes the set of
all functions from $X$ to $Y$. When specifying the semantics
of the quantifier $Q$ we specify a function

$$[\![Q]\!] : ([\![A]\!] \to [\![B]\!]) \to [\![B]\!]$$

The semantics of an expression $M$ of sort $B$ takes an
environment $\sigma$, which is a function from variable names to
elements of $[\![A]\!]$, and produces an element of $[\![B]\!]$,
hence $[\![M]\!]\sigma \in [\![B]\!]$. We define the semantics of an
expression $Qx. M$ by:

$$[\![Qx. M]\!]\sigma = [\![Q]\!]\, h$$

where $h : [\![A]\!] \to [\![B]\!]$ is the function

$$h(a) = [\![M]\!](\sigma[x := a])$$

Here

$$\sigma[x := a](y) = \begin{cases} \sigma(y), & \text{if } y \not\equiv x \\ a, & \text{if } y \equiv x \end{cases}$$

Specifying types for quantifiers allows to express more 

Let $\sigma_A$ be some arbitrary dummy global environment. If
$F$ is a formula without global variables we write
$[\![F]\!]\sigma_A$ to denote the truth value of $F$; clearly
$[\![F]\!]\sigma_A$ does not depend on $\sigma_A$ and we denote it
simply $[\![F]\!]$ when no ambiguity arises.

We use Hilbert's epsilon as a notational convenience in
metatheory. If $P(x)$ is a unary predicate, then
$\varepsilon x. P(x)$ denotes an arbitrary element $d$ such that
$P(d)$ holds, if such an element exists, or an arbitrary object
otherwise.

2.1 Term Algebra 

We introduce the notion of term algebra [22, Page 14]. 

Let Nat be the set of natural numbers. Let the signature
$\Sigma$ be a finite set of function symbols and constants and let
$ar : \Sigma \to \mathrm{Nat}$ be a function specifying arity
$ar(f)$ for every function symbol or constant $f \in \Sigma$. Let
$FT(\Sigma)$ denote the set of finite ground terms over signature
$\Sigma$. We assume that $\Sigma$ contains at least one constant
$c \in \Sigma$, $ar(c) = 0$, and at least one function symbol
$f \in \Sigma$, $ar(f) > 0$. Therefore, $FT(\Sigma)$ is countably
infinite.

Let $Cons(\Sigma)$ be the term algebra interpretation of signature
$\Sigma$, defined as follows [22, Page 14]. For every
$f \in \Sigma$ with $ar(f) = k$ define
$[\![f]\!] \in Cons(\Sigma)$, with
$[\![f]\!] : FT(\Sigma)^k \to FT(\Sigma)$ by

$$[\![f]\!](t_1, \ldots, t_k) = f(t_1, \ldots, t_k)$$

We will write $f$ instead of $[\![f]\!]$ when it causes no confusion.

2.2 Terms as Trees 

We define trees representing terms as follows. 

We use sequences of nonnegative integers to denote paths
in the tree. Let $\Sigma$ be a signature. A tree over $\Sigma$ is a
partial function $t$ from the set $\mathrm{Nat}^*$ of paths to the
set $\Sigma$ of function symbols such that:

1. if $w \in \mathrm{Nat}^*$, $x \in \mathrm{Nat}$, and
$t(w \cdot x)$ is defined, then $t(w)$ is defined as well;

2. if $t(w) = f$ with $ar(f) = k$, then

$$\{ i \mid t(w \cdot i) \text{ is defined} \} = \{1, \ldots, k\}$$

A finite tree is a tree with a finite domain. 
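As a sketch of this definition, a finite tree can be stored as a Python dictionary from paths (tuples of positive integers) to symbols. The signature below and the helper names `is_tree` and `term_to_tree` are assumptions made for illustration only.

```python
# Hypothetical signature (an assumption for illustration): symbol -> arity.
SIGNATURE = {"a": 0, "g": 1, "f": 2}

def is_tree(t, signature=SIGNATURE):
    """Check conditions 1 and 2 for a dict mapping paths (tuples) to symbols."""
    for path in t:
        # Condition 1: if t(w . x) is defined then t(w) is defined.
        if path and path[:-1] not in t:
            return False
    for path, symbol in t.items():
        # Condition 2: the defined children of a node labelled f
        # are exactly the positions 1..ar(f).
        children = {p[-1] for p in t
                    if len(p) == len(path) + 1 and p[:-1] == path}
        if children != set(range(1, signature[symbol] + 1)):
            return False
    return True

def term_to_tree(term, path=()):
    """Convert a nested-tuple term (symbol, arg1, ..., argk) into a tree."""
    symbol, *args = term
    tree = {path: symbol}
    for i, arg in enumerate(args, start=1):
        tree.update(term_to_tree(arg, path + (i,)))
    return tree
```

For example, the term f(a, g(a)) becomes a tree with the four paths `()`, `(1,)`, `(2,)` and `(2, 1)`.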

2.3 First Order Structures with Partial Functions

We make use of partial functions in our quantifier elimination
procedures. In this section we briefly describe the approach to
partial functions we chose to use; other approaches would work as
well, see e.g. [24].

A language of partial functions $\Sigma_1$ contains partial
function symbols in addition to total function symbols and
relation symbols. Consider a structure with the domain $A$
interpreting a language with partial function symbols $\Sigma_1$.
Given some environment $\sigma$ and a term $t$, we have
$[\![t]\!]\sigma \in A \cup \{\bot\}$ where $\bot \notin A$ is a
special value denoting undefined results. We require the
interpretations of total and partial function symbols to be
strict in $\bot$, i.e.
$f(a_1, \ldots, a_i, \bot, a_{i+2}, \ldots, a_k) = \bot$.

We interpret atomic formulas and their negations over
the three-valued domain {false, true, undef} using strong
Kleene's three-valued logic [26, 24, 44]. We require that
$[\![R]\!](a_1, \ldots, a_i, \bot, a_{i+2}, \ldots, a_k) = \mathsf{undef}$
for every relational symbol $R$. Logical connectives in Kleene's
strong three-valued logic are the strongest "regular" extension of
the corresponding connectives on the two-valued domain [26]. The
regularity requirement means that the three-valued logic is
a sound approximation of two-valued logic in the following
sense. We may obtain the truth tables for three-valued logic
by considering the truth values false, true, undef as shorthands
for sets {false}, {true}, {false, true} and defining each
logical operation $*$ by:

$$s_1 \,[\![*]\!]\, s_2 = \{ b_1 \circ b_2 \mid b_1 \in s_1 \wedge b_2 \in s_2 \}$$

where $\circ$ denotes the corresponding operation in the
two-valued logic. As in a call-by-value semantics of lambda
calculus, variables in the environments ($\sigma$) do not range
over $\bot$. We interpret quantifiers as ranging over the domain
$A$ or its subset if the logic is multisorted; the interpretations
of quantifiers are similarly the best regular approximations of
the corresponding two-valued interpretations.

These properties of Kleene's three-valued logic have the
following important consequence. Suppose that we extend
the definition of all partial functions to make them total
functions on the domain $A$ by assigning arbitrary values outside
the original domain. Suppose that a formula $\phi$ evaluates
to an element $b \in \{\mathsf{false}, \mathsf{true}\}$ in Kleene's
logic. Then $\phi$ evaluates to the same truth value $b$ in the new
logic of total functions. This property of three-valued logic
implies that the algorithms that we use to transform formulas with
partial functions will apply even for the logic that makes all
functions total by completing them with arbitrary elements
of $A$.

We say that a formula $\phi$ is well-defined iff its truth value
is an element of {false, true}.

Example 1 Consider the domain of real numbers. The following
formulas are not well-defined:

$$3 = 1/0$$

$$\forall x.\ 1/x > 0 \vee 1/x < 0 \vee 1/x = 0$$

The following formulas are well-defined:

$$\exists x.\ 1/x = 3$$

$$\forall x.\ 1/x \neq 3$$

$$x = 0 \vee 1/x > 0$$

♦ 
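The truth values in Example 1 can be reproduced with a minimal sketch of strong Kleene logic. The encoding is an assumption made for illustration: truth values are ordered as the chain false < undef < true, `None` stands for the undefined element $\bot$, and a small finite sample of rationals (containing 0 and 1/3) stands in for the reals, which is enough for these particular formulas but is not a general evaluation method.

```python
import operator
from fractions import Fraction

UNDEF = "undef"                          # third truth value
_RANK = {False: 0, UNDEF: 1, True: 2}    # the chain false < undef < true
_VALUE = {rank: value for value, rank in _RANK.items()}

def k_and(*values):
    """Strong Kleene conjunction: the minimum in the chain."""
    return _VALUE[min(_RANK[v] for v in values)]

def k_or(*values):
    """Strong Kleene disjunction: the maximum in the chain."""
    return _VALUE[max(_RANK[v] for v in values)]

def k_not(value):
    return UNDEF if value == UNDEF else (not value)

def inv(x):
    """Partial function 1/x; None plays the role of the undefined element."""
    return None if x is None or x == 0 else Fraction(1) / x

def atom(relation, s, t):
    """Strict atomic formula: undef as soon as one argument is undefined."""
    return UNDEF if s is None or t is None else relation(s, t)

# Assumed finite sample standing in for the reals; it contains 0 and 1/3.
SAMPLE = [Fraction(n, 3) for n in range(-6, 7)]

def forall(body):
    return k_and(*(body(x) for x in SAMPLE))

def exists(body):
    return k_or(*(body(x) for x in SAMPLE))

not_well_defined = [
    atom(operator.eq, 3, inv(0)),
    forall(lambda x: k_or(atom(operator.gt, inv(x), 0),
                          atom(operator.lt, inv(x), 0),
                          atom(operator.eq, inv(x), 0))),
]
well_defined = [
    exists(lambda x: atom(operator.eq, inv(x), 3)),
    forall(lambda x: k_not(atom(operator.eq, inv(x), 3))),
] + [k_or(atom(operator.eq, x, 0), atom(operator.gt, inv(x), 0))
     for x in SAMPLE]
```

Both entries of `not_well_defined` evaluate to `undef`, while every entry of `well_defined` is either `True` or `False`; the last formula is checked separately for each sample value of its free variable.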

We say that a formula $\phi_1$ is equivalent to a formula $\phi_2$
and write $\phi_1 \cong \phi_2$ iff

$$[\![\phi_1]\!]\sigma = [\![\phi_2]\!]\sigma$$

for all valuations $\sigma$ (including those for which
$[\![\phi_1]\!]\sigma = \mathsf{undef}$).

Sections below perform equivalence-preserving transfor- 
mations of formulas. This means that starting from a well- 
defined formula we obtain an equivalent well-defined for- 
mula. 

When doing equivalence-preserving transformations it is
useful to observe that $\wedge$, $\vee$ still form a distributive
lattice. The partial order of this lattice is the chain
false < undef < true. The element undef does not have a complement
in the lattice; the unary operation $\neg$ does not denote the
lattice complement. However, the following laws still hold:

$$\neg(x \wedge y) \cong \neg x \vee \neg y$$

$$\neg(x \vee y) \cong \neg x \wedge \neg y$$

$$\neg\neg x \cong x$$

The properties of $\wedge$, $\vee$, $\neg$ are sufficient to
transform any quantifier-free formula into a disjunction of
conjunctions of literals using the well-known straightforward
technique. However, this straightforward technique in some cases
yields conjunctions that are not well-defined, even though the
formula as a whole is well-defined.

Example 2 Transforming a negation of a well-defined formula:

$$\neg(x \neq 0 \wedge (y = 1/x \vee z = x + 1))$$

may yield the following disjunction of conjunctions:

$$x = 0 \vee (y \neq 1/x \wedge z \neq x + 1)$$

where $y \neq 1/x \wedge z \neq x + 1$ is not a well-defined
conjunction for $x = 0$.

♦

To enable the transformation of each well-defined formula
into a disjunction of well-defined conjunctions of literals,
we enrich the language of function and relation symbols
as follows. With each partial function symbol $f \in \Sigma_1$ of
arity $k = ar(f)$ we associate a domain description
$D_f = \langle \langle x_1, \ldots, x_k \rangle, \phi \rangle$
specifying the domain of $f$. Here $x_1, \ldots, x_k$ are distinct
variables and $\phi$ is an unnested conjunction of literals such
that $FV(\phi) \subseteq \{x_1, \ldots, x_k\}$. We require every
interpretation of a first-order structure with partial function
symbols to satisfy the following property:

$$[\![f]\!](a_1, \ldots, a_k) \neq \bot \iff [\![\phi]\!][x_1 \mapsto a_1, \ldots, x_k \mapsto a_k]$$

for all $a_1, \ldots, a_k \in A$. We henceforth assume that every
structure with partial functions is equipped with a domain
description $D_f$ for every partial function symbol $f$.

Proposition 8 below gives an algorithm for transforming a given
well-defined formula into a disjunction of well-defined
conjunctions. We first give some definitions and lemmas.

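Definition 3 If $\psi$ is a formula with free variables, a
domain formula for $\psi$ is a formula $\phi$ not containing
partial function symbols such that, for every valuation $\sigma$,

$$[\![\psi]\!]\sigma \neq \mathsf{undef} \iff [\![\phi]\!]\sigma = \mathsf{true}$$

From Definition 3 we obtain the following Lemma 4.

Lemma 4 Let $\psi$ be a formula and $\phi$ a domain formula for
$\psi$. Then

$$\psi \cong (\psi \wedge \phi) \vee (\mathsf{undef} \wedge \neg\phi)$$

Proof. Let $\sigma$ be an arbitrary valuation. Let
$v = [\![\psi]\!]\sigma$. If
$v \in \{\mathsf{true}, \mathsf{false}\}$ then
$[\![\phi]\!]\sigma = \mathsf{true}$ and

$$[\![(\psi \wedge \phi) \vee (\mathsf{undef} \wedge \neg\phi)]\!]\sigma = (v \wedge \mathsf{true}) \vee (\mathsf{undef} \wedge \mathsf{false}) = v.$$

If $v = \mathsf{undef}$ then $[\![\phi]\!]\sigma = \mathsf{false}$, so

$$[\![(\psi \wedge \phi) \vee (\mathsf{undef} \wedge \neg\phi)]\!]\sigma = (\mathsf{undef} \wedge \mathsf{false}) \vee (\mathsf{undef} \wedge \mathsf{true}) = \mathsf{undef}.$$

■

Observe that $\psi \wedge \phi$ in Lemma 4 is a well-defined
conjunction. We use this property to construct domain formulas
using partial function domain descriptions.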

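Let

$$D_f = \langle \langle x_1, \ldots, x_k \rangle,\ B^f_1 \wedge \ldots \wedge B^f_{l_f} \rangle$$

for each partial function symbol $f \in \Sigma_1$ of arity $k$,
where $B^f_1, \ldots, B^f_{l_f}$ are unnested literals. If
$t_1, \ldots, t_k$ are terms, we write $B^f_i(t_1, \ldots, t_k)$ for
$B^f_i[x_1 := t_1, \ldots, x_k := t_k]$. Let $subt(t)$ denote the
set of all subterms of term $t$.

For any literal $B(t_1, \ldots, t_n)$ where
$B(t_1, \ldots, t_n) \equiv R(t_1, \ldots, t_n)$ or
$B(t_1, \ldots, t_n) \equiv \neg R(t_1, \ldots, t_n)$, define

$$DomForm(B(t_1, \ldots, t_n)) = \bigwedge_{\substack{f(s_1, \ldots, s_k) \in \bigcup_{1 \leq i \leq n} subt(t_i) \\ 1 \leq j \leq l_f}} B^f_j(s_1, \ldots, s_k)$$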
Lemma 5 Let $B(t_1, \ldots, t_n)$ be a literal containing partial
function symbols. Then $DomForm(B(t_1, \ldots, t_n))$ is a domain
formula for $B(t_1, \ldots, t_n)$.

Proof. Let $\sigma$ be a valuation. By strictness of
interpretations of function and predicate symbols,
$[\![B(t_1, \ldots, t_n)]\!]\sigma \neq \mathsf{undef}$ iff
$[\![f(s_1, \ldots, s_k)]\!]\sigma \neq \bot$ for every subterm
$f(s_1, \ldots, s_k)$ of every term $t_i$, iff
$[\![B^f_j(s_1, \ldots, s_k)]\!]\sigma = \mathsf{true}$ for every
$1 \leq j \leq l_f$ and every subterm $f(s_1, \ldots, s_k)$. ■
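The definition of DomForm can be sketched as a traversal over subterms. The encoding below is an assumption for illustration: variables are strings, applications are nested tuples `(symbol, arg1, ..., argk)`, literals are tuples `(relation, term1, ..., termn)`, and the only partial symbol is a hypothetical `inv` (for 1/x) whose domain description is the single literal `x1 != 0`.

```python
# Hypothetical domain descriptions (an assumption for illustration):
# partial symbol -> (formal parameters, list of unnested literals).
DOMAIN = {"inv": (("x1",), [("!=", "x1", ("0",))])}

def subterms(t):
    """Yield t and all of its subterms."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)

def substitute(t, env):
    """Replace variables (strings) in t according to env."""
    if isinstance(t, str):
        return env.get(t, t)
    return (t[0],) + tuple(substitute(arg, env) for arg in t[1:])

def dom_form(literal):
    """Conjuncts B^f_j(s_1, ..., s_k) for every subterm f(s_1, ..., s_k)
    of the literal whose head symbol f is partial."""
    _relation, *terms = literal
    conjuncts = []
    for t in terms:
        for s in subterms(t):
            if isinstance(s, tuple) and s[0] in DOMAIN:
                params, literals = DOMAIN[s[0]]
                env = dict(zip(params, s[1:]))
                for rel, *args in literals:
                    c = (rel,) + tuple(substitute(a, env) for a in args)
                    if c not in conjuncts:
                        conjuncts.append(c)
    return conjuncts
```

For the literal y = 1/x this yields the single conjunct x != 0. For z = 1/(1/x) it yields 1/x != 0 and x != 0, so the result may itself contain partial symbols, which is why Lemma 6 applies DomForm again to each $F_i$.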

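Lemma 6 Let $B$ be a literal and let

$$DomForm(B) = F_1 \wedge \ldots \wedge F_m.$$

Then

$$B \cong (B \wedge F_1 \wedge \ldots \wedge F_m) \vee \bigvee_{1 \leq i \leq m} (\mathsf{undef} \wedge \neg F_i \wedge DomForm(F_i))$$

Proof. If $[\![B]\!]\sigma \neq \mathsf{undef}$, then
$[\![F_i]\!]\sigma = \mathsf{true}$ for every $1 \leq i \leq m$, and

$$[\![\mathsf{undef} \wedge \neg F_i \wedge DomForm(F_i)]\!]\sigma = \mathsf{false}$$

so the right-hand side evaluates to $[\![B]\!]\sigma$ as well. Now
consider the case when $[\![B]\!]\sigma = \mathsf{undef}$. Then there
exists a term $f(s_1, \ldots, s_k)$ such that
$[\![f(s_1, \ldots, s_k)]\!]\sigma = \mathsf{undef}$. Because
$\sigma(x) \neq \bot$ for every variable $x$, there exists a
term $f(s_1, \ldots, s_k)$ such that
$[\![f(s_1, \ldots, s_k)]\!]\sigma = \mathsf{undef}$ and
$[\![s_i]\!]\sigma \neq \mathsf{undef}$ for $1 \leq i \leq k$. Then
there exists a formula $F_p$ of form $B^f_j(s_1, \ldots, s_k)$ such
that $[\![B^f_j(s_1, \ldots, s_k)]\!]\sigma = \mathsf{false}$, and

$$[\![\mathsf{undef} \wedge \neg F_p \wedge DomForm(F_p)]\!]\sigma = \mathsf{undef}.$$

Because

$$[\![B \wedge F_1 \wedge \ldots \wedge F_m]\!]\sigma = \mathsf{false},$$

and for every $q$,

$$[\![\mathsf{undef} \wedge \neg F_q \wedge DomForm(F_q)]\!]\sigma \in \{\mathsf{undef}, \mathsf{false}\},$$

the right-hand side evaluates to undef. ■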

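Lemma 7 Let $\phi_0(\bar{y})$ and $\phi_1(\bar{y})$ be well-defined
formulas whose free variables are among $\bar{y}$ and let

$$\psi(\bar{y}) \equiv (\mathsf{undef} \wedge \phi_0(\bar{y})) \vee \phi_1(\bar{y})$$

If $\psi(\bar{y})$ is well-defined for all values of variables
$\bar{y}$, then

$$\psi(\bar{y}) \cong \phi_1(\bar{y})$$

Proof. Consider any valuation $\sigma$. Let

$$v = [\![\phi_1(\bar{y})]\!]\sigma$$

and

$$v' = [\![\psi(\bar{y})]\!]\sigma$$

We need to show $v = v'$. Because $\phi_1(\bar{y})$ and
$\psi(\bar{y})$ are well-defined,
$v, v' \in \{\mathsf{false}, \mathsf{true}\}$. We consider two cases.
Case 1. $v = \mathsf{true}$. Then also $v' = \mathsf{true}$.
Case 2. $v = \mathsf{false}$. Then
$v' = \mathsf{undef} \wedge [\![\phi_0(\bar{y})]\!]\sigma$. Because
$v' \neq \mathsf{undef}$, we conclude $v' = \mathsf{false}$. ■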
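Since the three-valued domain is finite, the propositional identities used so far (the De Morgan laws, Lemma 4, and Lemma 7) can be checked exhaustively. The sketch below is only a sanity check under an assumed encoding of truth values as the chain false < undef < true; it treats $\psi$, $\phi_0$, $\phi_1$ as truth values rather than formulas.

```python
from itertools import product

UNDEF = "undef"
_RANK = {False: 0, UNDEF: 1, True: 2}    # the chain false < undef < true
_VALUE = {rank: value for value, rank in _RANK.items()}
VALUES = [False, UNDEF, True]

def k_and(a, b):
    return _VALUE[min(_RANK[a], _RANK[b])]

def k_or(a, b):
    return _VALUE[max(_RANK[a], _RANK[b])]

def k_not(a):
    return UNDEF if a == UNDEF else (not a)

def de_morgan_holds():
    return all(
        k_not(k_and(x, y)) == k_or(k_not(x), k_not(y))
        and k_not(k_or(x, y)) == k_and(k_not(x), k_not(y))
        and k_not(k_not(x)) == x
        for x, y in product(VALUES, repeat=2))

def lemma4_holds():
    for psi in VALUES:
        # A domain formula phi is true exactly when psi is defined.
        phi = psi != UNDEF
        rhs = k_or(k_and(psi, phi), k_and(UNDEF, k_not(phi)))
        if rhs != psi:
            return False
    return True

def lemma7_holds():
    # phi0 and phi1 are well-defined, so they range over {False, True}.
    for phi0, phi1 in product([False, True], repeat=2):
        psi = k_or(k_and(UNDEF, phi0), phi1)
        # The lemma only speaks about the case where psi is well-defined.
        if psi != UNDEF and psi != phi1:
            return False
    return True
```

All three checks return `True`; in `lemma7_holds` the only combination skipped by the well-definedness premise is `phi0 = True`, `phi1 = False`.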



Proposition 8 Every well-defined quantifier-free formula
$\psi$ can be transformed into an equivalent disjunction $\psi'$ of
well-defined conjunctions of literals.

Proof. Using the standard procedure, convert $\psi$ to a
disjunction of conjunctions

$$C_1 \vee \ldots \vee C_n$$

Let $C_i \equiv B \wedge C'_i$ where $B$ is a literal and let
$DomForm(B) = F_1 \wedge \ldots \wedge F_m$. Replace
$B \wedge C'_i$ by

$$(B \wedge F_1 \wedge \ldots \wedge F_m \wedge C'_i) \vee \bigvee_{1 \leq j \leq m} (\mathsf{undef} \wedge \neg F_j \wedge DomForm(F_j) \wedge C'_i)$$

By Lemma 6 and distributivity, the result is an equivalent
formula. Repeat this process for every literal in
$C_1 \vee \ldots \vee C_n$. The result can be written in the form

$$(\mathsf{undef} \wedge \phi_1) \vee \ldots \vee (\mathsf{undef} \wedge \phi_p) \vee \phi_{p+1} \vee \ldots \vee \phi_{p+q} \tag{2}$$

where each $\phi_i$ for $1 \leq i \leq p + q$ is a well-defined
conjunction. Formula (2) is equivalent to

$$(\mathsf{undef} \wedge (\phi_1 \vee \ldots \vee \phi_p)) \vee (\phi_{p+1} \vee \ldots \vee \phi_{p+q}) \tag{3}$$

and is equivalent to the well-defined formula $\psi$, so it is
well-defined. Formulas $\phi_1 \vee \ldots \vee \phi_p$ and
$\phi_{p+1} \vee \ldots \vee \phi_{p+q}$ are also well-defined. By
Lemma 7, we conclude that formula (3) is equivalent to

$$\phi_{p+1} \vee \ldots \vee \phi_{p+q} \tag{4}$$

Because (4) is a disjunction of well-defined formulas, (4) is
the desired result $\psi'$. ■

The following proposition presents transformation to 
unnested form for the structures with equality and partial 
function symbols, building on Proposition 8. For a similar 
unnested form in the first-order logic containing only total 
function symbols, see [22, Page 58]. 

Proposition 9 Every well-defined quantifier-free formula
$\psi$ in a language with equality can be effectively transformed
into an equivalent formula $\psi'$ where $\psi'$ is a disjunction of
existentially quantified well-defined conjunctions of the
following kinds of literals:

• $R(x_1, \ldots, x_k)$ where $R$ is some relational symbol of
arity $k$ and $x_1, \ldots, x_k$ are variables;

• $\neg R(x_1, \ldots, x_k)$ where $R$ is some relational symbol of
arity $k$ and $x_1, \ldots, x_k$ are variables;

• $x_1 = x_2$ where $x_1, x_2$ are variables;

• $x = f(x_1, \ldots, x_k)$ where $f$ is some partial or total
function symbol of arity $k$ and $x, x_1, \ldots, x_k$ are
variables;

• $x_1 \neq x_2$ where $x_1$ and $x_2$ are variables.

Proof. Transform the formula to a disjunction of well-defined
conjunctions of literals as in the proof of Proposition 8.

Then repeatedly perform the following transformation
on each well-defined conjunction $\phi$. Let
$A(f(x_1, \ldots, x_k))$ be an atomic formula containing term
$f(x_1, \ldots, x_k)$. Replace
$\phi \wedge A(f(x_1, \ldots, x_k))$ with

$$\exists x_0.\ \phi \wedge x_0 = f(x_1, \ldots, x_k) \wedge A(x_0)$$

Replace $x \neq f(x_1, \ldots, x_k)$ with

$$x_0 = f(x_1, \ldots, x_k) \wedge x_0 \neq x$$

Repeat this process until the resulting conjunction $\phi'$ is in
unnested form. $\phi'$ is clearly equivalent to the original
conjunction $\phi$ when all partial functions are well-defined.
When some partial function is not well-defined, then both $\phi$
and $\phi'$ evaluate to false, because by construction of $\phi$ in
the proof of Proposition 8, each conjunction contains conjuncts
that evaluate to false when some application of a function symbol
is not well-defined. ■
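The unnesting step in this proof can be sketched for a single literal as follows. The encoding (strings for variables, nested tuples for applications and literals) and the fresh variable names `v0, v1, ...` are assumptions for illustration. Unlike the proof, this simplified sketch names every application with a fresh variable, and it does not add the domain conjuncts on which the equivalence relies.

```python
from itertools import count

def unnest(literal, fresh=None):
    """Flatten a literal (relation, t1, ..., tn) into unnested conjuncts.

    Returns (bound, conjuncts): the existentially bound fresh variables
    and the list of unnested literals whose conjunction replaces the input.
    """
    if fresh is None:
        fresh = (f"v{i}" for i in count())   # hypothetical fresh names
    bound, conjuncts = [], []

    def name(t):
        # Return a variable denoting t, emitting x = f(x1, ..., xk)
        # conjuncts for every application encountered on the way.
        if isinstance(t, str):
            return t
        args = tuple(name(arg) for arg in t[1:])
        x = next(fresh)
        bound.append(x)
        conjuncts.append(("=", x, (t[0],) + args))
        return x

    relation, *terms = literal
    flat = (relation,) + tuple(name(t) for t in terms)
    conjuncts.append(flat)
    return bound, conjuncts
```

For the literal y != inv(inv(x)) the sketch binds `v0` and `v1` and produces the conjuncts `v0 = inv(x)`, `v1 = inv(v0)`, and `y != v1`.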

Let a left-strict conjunction in Kleene logic be denoted
by $\wedge'$ and defined by

$$p \wedge' q \equiv (p \wedge q) \vee (p \wedge \neg p)$$

The correctness of the transformation to unnested form 
in Proposition 9 relies on the presence of conjuncts that en- 
sure that the entire conjunction evaluates to false whenever 
some term is undefined. The following Lemma 10 enables 
transformation to unnested form in an arbitrary context, al- 
lowing the transformation to unnested form to be performed 
independently from ensuring well-definedness of conjuncts. 

Lemma 10 Let $\phi(x)$ be a formula with free variable $x$ and
let $t$ be a term possibly containing partial function symbols.
Then

1. $\phi(t) \cong (\exists x.\ x = t \wedge \phi(x)) \vee (\mathsf{undef} \wedge \forall x. \neg\phi(x))$;

2. $\phi(t) \cong \exists x.\ x = t \wedge' \phi(x)$;

3. $\phi(t) \cong (\exists x.\ x = t \wedge \phi(x)) \vee (t \neq t)$.

Proof. Straightforward. ■

Proposition 13 below shows that a simplification similar
to the one in Lemma 7 can be applied even within the scope
of quantifiers. To show Proposition 13 we first show two
lemmas.

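Lemma 11 For all formulas $\phi_0(x, \bar{y})$ and
$\phi_1(x, \bar{y})$,

$$\exists x.\ (\mathsf{undef} \wedge \phi_0(x, \bar{y})) \vee \phi_1(x, \bar{y}) \cong (\mathsf{undef} \wedge \exists x. \phi_0(x, \bar{y})) \vee \exists x. \phi_1(x, \bar{y})$$

Proof. By distributivity of quantifiers and propositional
connectives in Kleene logic we have:

$$\begin{aligned}
& \exists x.\ (\mathsf{undef} \wedge \phi_0(x, \bar{y})) \vee \phi_1(x, \bar{y}) \\
\cong\ & (\exists x.\ \mathsf{undef} \wedge \phi_0(x, \bar{y})) \vee \exists x. \phi_1(x, \bar{y}) \\
\cong\ & (\mathsf{undef} \wedge \exists x. \phi_0(x, \bar{y})) \vee \exists x. \phi_1(x, \bar{y})
\end{aligned}$$

■

Lemma 12 For all formulas $\phi_0(x, \bar{y})$ and
$\phi_1(x, \bar{y})$,

$$\forall x.\ (\mathsf{undef} \wedge \phi_0(x, \bar{y})) \vee \phi_1(x, \bar{y}) \cong (\mathsf{undef} \wedge \forall x.\ \phi_0(x, \bar{y}) \vee \phi_1(x, \bar{y})) \vee \forall x. \phi_1(x, \bar{y})$$

Proof. The following sequence of equivalences holds.

$$\begin{aligned}
& \forall x.\ (\mathsf{undef} \wedge \phi_0(x, \bar{y})) \vee \phi_1(x, \bar{y}) \\
\cong\ & \neg\exists x.\ \neg((\mathsf{undef} \wedge \phi_0(x, \bar{y})) \vee \phi_1(x, \bar{y})) \\
\cong\ & \neg\exists x.\ (\mathsf{undef} \vee \neg\phi_0(x, \bar{y})) \wedge \neg\phi_1(x, \bar{y}) \\
\cong\ & \neg\exists x.\ (\mathsf{undef} \wedge \neg\phi_1(x, \bar{y})) \vee (\neg\phi_0(x, \bar{y}) \wedge \neg\phi_1(x, \bar{y})) \\
\cong\ & \neg((\mathsf{undef} \wedge \exists x. \neg\phi_1(x, \bar{y})) \vee (\exists x.\ \neg\phi_0(x, \bar{y}) \wedge \neg\phi_1(x, \bar{y}))) \\
\cong\ & (\mathsf{undef} \vee \forall x. \phi_1(x, \bar{y})) \wedge (\forall x.\ \phi_0(x, \bar{y}) \vee \phi_1(x, \bar{y})) \\
\cong\ & (\mathsf{undef} \wedge \forall x.\ \phi_0(x, \bar{y}) \vee \phi_1(x, \bar{y})) \vee \forall x. \phi_1(x, \bar{y})
\end{aligned}$$

■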

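Proposition 13 Let $\phi_0(\bar{x}, \bar{y})$ and
$\phi_1(\bar{x}, \bar{y})$ be well-defined formulas whose free
variables are among $\bar{x}, \bar{y}$ and let

$$\psi(\bar{y}) \equiv Q_1 x_1 \ldots Q_n x_n.\ (\mathsf{undef} \wedge \phi_0(\bar{x}, \bar{y})) \vee \phi_1(\bar{x}, \bar{y})$$

where $Q_1, \ldots, Q_n$ are quantifiers. If $\psi(\bar{y})$ is
well-defined for all values of variables $\bar{y}$, then

$$\psi(\bar{y}) \cong Q_1 x_1 \ldots Q_n x_n.\ \phi_1(\bar{x}, \bar{y})$$

Proof. Applying successively Lemmas 11 and 12 to quantifiers
$Q_n, \ldots, Q_1$, we conclude

$$\psi(\bar{y}) \cong (\mathsf{undef} \wedge \phi_2(\bar{y})) \vee Q_1 x_1 \ldots Q_n x_n.\ \phi_1(\bar{x}, \bar{y})$$

for some formula $\phi_2(\bar{y})$. Then by Lemma 7,

$$\psi(\bar{y}) \cong Q_1 x_1 \ldots Q_n x_n.\ \phi_1(\bar{x}, \bar{y}).$$

■
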
3 Some Quantifier Elimination Procedures 

As a preparation for the proof of the decidability of term 
algebras of decidable theories, we present quantifier elimina- 
tion procedures for some theories that are known to admit 
quantifier elimination. We use the results and ideas from 
this section to show the new results in Sections 4, 5, 6. 

3.1 Quantifier Elimination 

Our technique for showing decidability of structural subtyping
of non-recursive types is based on quantifier elimination.
This section gives some general remarks on quantifier
elimination.

We follow [22] in describing quantifier elimination procedures.
According to [22, Page 70, Lemma 2.7.4] it suffices
to eliminate $\exists y$ from formulas of the form

$$\exists y.\ \bigwedge_{0 \leq i < n} \psi_i(\bar{x}, y) \tag{5}$$

where $\bar{x}$ is a tuple of variables and $\psi_i(\bar{x}, y)$ is
a literal all of whose variables are among $\bar{x}, y$. The reason
why eliminating quantifiers from formulas of the form (5) suffices
is the following. Suppose that the formula is in prenex form and
consider the innermost quantifier of the formula. Let $\phi$ be the
subformula containing the quantifier and the subformula that is the
scope of the quantifier. If $\phi$ is of the form
$\forall x. \phi_0$ we may replace $\phi$ with
$\neg\exists x. \neg\phi_0$. Hence, we may assume that $\phi$ is of
the form $\exists x. \phi_1$. We then transform $\phi_1$ into
disjunctive normal form and use the fact

$$\exists x.\ (\phi_2 \vee \phi_3) \iff (\exists x. \phi_2) \vee (\exists x. \phi_3) \tag{6}$$

We conclude that elimination of quantifiers from formulas of 
form (5) suffices to eliminate the innermost quantifier. By 
repeatedly eliminating innermost quantifiers we can elimi- 
nate all quantifiers from a formula. 

We may also assume that $y$ occurs in every literal $\psi_i$,
otherwise we would place the literal outside the existential
quantifier using the fact

$$\exists y.\ (A \wedge B) \iff (\exists y. A) \wedge B$$

for $y$ not occurring in $B$.

To eliminate variables we often use the following identity
of a theory with equality:

$$\exists x.\ x = t \wedge \phi(x) \iff \phi(t) \tag{7}$$

Section 2.3 presents analogous identities for partial functions.

The quantifier elimination procedures we give imply the
decidability of the underlying theories. In this paper the
interpretations of function and relation symbols on some domain
$A$ are effectively computable functions and relations on $A$.
Therefore, the truth value of every formula without variables
is computable. The quantifier elimination procedures
we present are all effective. To determine the truth value of
a closed formula $\phi$ it therefore suffices to apply the
quantifier elimination procedure to $\phi$, yielding a
quantifier-free formula $\psi$, and then evaluate the truth value
of $\psi$.

3.2 Quantifier Elimination for Boolean Algebras 

This section presents a quantifier elimination procedure for 
finite boolean algebras. This result dates back at least to 
[46], see also [51, 27, 32, 6, 49], [22, Section 2.7 Exercise 3]. 
Note that the operations union, intersection and comple- 
ment are definable in the first-order language of the subset 
relation. Therefore, quantifier elimination for the first-order 
theory of the boolean algebra of sets is no harder than the 
quantifier elimination for the first-order theory of the sub- 
set relation. However, the operations of boolean algebra are 
useful in the process of quantifier elimination, so we give the 
quantifier elimination procedure for the language containing 
boolean algebra operations. 

Instead of the first-order theory of the subset relation
we could consider monadic second-order theory with no
relation or function symbols. These two languages are equivalent
because the first-order quantifiers can be eliminated
from monadic second-order theory using the subset relation
(see Section 7.1).

Finite boolean algebras are isomorphic to boolean algebras
whose elements are all subsets of some finite set. We
therefore use the symbols for the set operations as the language
of boolean algebras. $t_1 \cap t_2$, $t_1 \cup t_2$, $t_1^c$, $0$,
$1$ correspond to set intersection, set union, set complement,
empty set, and full set, respectively. We write
$t_1 \subseteq t_2$ for $t_1 \cap t_2 = t_1$, and we write
$t_1 \subset t_2$ for the conjunction
$t_1 \subseteq t_2 \wedge t_1 \neq t_2$.

For every nonnegative integer $k$ we introduce formulas
$|t| \geq k$ expressing that the set denoted by $t$ has at least
$k$ elements, and formulas $|t| = k$ expressing that the set
denoted by $t$ has exactly $k$ elements. These properties are
first-order definable as follows.

$$\begin{aligned}
|t| \geq 0 &\equiv \mathsf{true} \\
|t| \geq k+1 &\equiv \exists x.\ x \subset t \wedge |x| \geq k \\
|t| = k &\equiv |t| \geq k \wedge \neg |t| \geq k+1
\end{aligned}$$
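The recursive definition can be checked directly on a small powerset algebra. The sketch below assumes sets are represented as Python `frozenset` values and evaluates the existential quantifier by brute-force enumeration of subsets; it is meant to confirm that the definition agrees with ordinary cardinality, not as an efficient procedure.

```python
from itertools import combinations

def subsets(s):
    """Yield every subset of s as a frozenset."""
    elements = list(s)
    for size in range(len(elements) + 1):
        for chosen in combinations(elements, size):
            yield frozenset(chosen)

def at_least(t, k):
    """|t| >= k, following the recursive first-order definition."""
    if k == 0:
        return True
    # exists x. x is a proper subset of t and |x| >= k - 1;
    # it is enough to let x range over subsets of t.
    return any(x < t and at_least(x, k - 1) for x in subsets(t))

def exactly(t, k):
    """|t| = k, defined as |t| >= k and not |t| >= k + 1."""
    return at_least(t, k) and not at_least(t, k + 1)
```

On every subset `t` of a four-element set, `at_least(t, k)` agrees with `len(t) >= k` and `exactly(t, k)` agrees with `len(t) == k`.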

We call a language which contains formulas $|t| \geq k$ and
$|t| = k$ the language of boolean algebras with finite cardinality
constraints. Because finite cardinality constraints are first-order
definable, the language with finite cardinality constraints is
equally expressive as the language of boolean algebras.

Every inequality $t_1 \subseteq t_2$ is equivalent to the equality
$t_1 \cap t_2 = t_1$, and every equality $t_3 = t_4$ is equivalent
to the cardinality constraint

$$|(t_3 \cap t_4^c) \cup (t_4 \cap t_3^c)| = 0$$

It is therefore sufficient to consider the first-order formulas
whose only atomic formulas are of the form $|t| = 0$. For
the purpose of quantifier elimination we will additionally
consider formulas that contain atomic formulas $|t| = k$ for all
$k \geq 1$, as well as $|t| \geq k$ for $k \geq 0$.

Note that we can eliminate negative literals as follows:

$$\begin{aligned}
\neg |t| = k &\iff |t| = 0 \vee \cdots \vee |t| = k-1 \vee |t| \geq k+1 \\
\neg |t| \geq k &\iff |t| = 0 \vee \cdots \vee |t| = k-1
\end{aligned} \tag{8}$$

Every formula in the language of boolean algebras can therefore
be written in prenex normal form where the matrix of
the formula is a disjunction of conjunctions of atomic formulas
of the form $|t| = k$ and $|t| \geq k$, with no negative
literals.

Note that if a term $t$ contains at least one operation of
arity one or more, we may assume that the constants $0$ and
$1$ do not appear in $t$, because $0$ and $1$ can be simplified
away. Furthermore, the expression $|0|$ denotes the integer zero,
so all formulas of form $|0| = k$ or $|0| \geq k$ evaluate to true
or false. We can therefore simplify every nontrivial term $t$ so
that either $t$ contains no occurrences of constants $0$ and $1$,
or $t \equiv 1$.

We next describe a quantifier elimination procedure for 
finite boolean algebras. 

We first transform the formula into prenex normal form 
and then repeatedly eliminate the innermost quantifier. As 
argued in Section 3.1, it suffices to show that we can elimi- 
nate an existential quantifier from any existentially quanti- 
fied conjunction of literals. Consider therefore an arbitrary 
existentially quantified conjunction of literals 

$$\exists y.\ \bigwedge_{1 \leq i \leq n} \psi_i(\bar{x}, y)$$

where $\psi_i$ is of the form $|t| = k$ or of the form
$|t| \geq k$. We assume that $y$ occurs in every formula $\psi_i$.
It follows that no $\psi_i$ contains $|0|$ or $|1|$.

Let $x_1, \ldots, x_m, y$ be the set of variables occurring in formulas
$\psi_i$ for $1 \leq i \leq n$.

First consider the more general case $m \geq 1$. Let, for
$i_1, \ldots, i_m \in \{0, 1\}$,

\[ t_{i_1 \ldots i_m} = x_1^{i_1} \cap \cdots \cap x_m^{i_m} \]

where $t^0 = t$ and $t^1 = t^c$. The terms in the set

\[ P = \{\, t_{i_1 \ldots i_m} \mid i_1, \ldots, i_m \in \{0, 1\} \,\} \]

\[
\begin{array}{l|l}
\text{original formula} & \text{eliminated form} \\ \hline
\exists y.\ |s \cap y| \geq k \ \land\ |s \cap y^c| \geq l & |s| \geq k + l \\
\exists y.\ |s \cap y| = k \ \land\ |s \cap y^c| \geq l & |s| \geq k + l \\
\exists y.\ |s \cap y| \geq k \ \land\ |s \cap y^c| = l & |s| \geq k + l \\
\exists y.\ |s \cap y| = k \ \land\ |s \cap y^c| = l & |s| = k + l
\end{array}
\]

Figure 1: Rules for Eliminating Quantifiers


form a partition; moreover, every boolean algebra expression whose
variables are among $x_i$ can be written as a disjoint union of some
elements of the partition $P$. Any boolean algebra expression
containing $y$ can be written, for some $p, q \geq 0$, as

\[ (s_1 \cap y) \cup \cdots \cup (s_p \cap y) \ \cup\ (t_1 \cap y^c) \cup \cdots \cup (t_q \cap y^c) \]

where $s_1, \ldots, s_p \in P$ are pairwise distinct elements from the
partition and $t_1, \ldots, t_q \in P$ are pairwise distinct elements
from the partition. Because

\[
|(s_1 \cap y) \cup \cdots \cup (s_p \cap y) \cup (t_1 \cap y^c) \cup \cdots \cup (t_q \cap y^c)|
 = |s_1 \cap y| + \cdots + |s_p \cap y| + |t_1 \cap y^c| + \cdots + |t_q \cap y^c|
\]

the constraint of form $|t| = k$ can be written as

\[
\bigvee_{k_1, \ldots, k_p, l_1, \ldots, l_q}
|s_1 \cap y| = k_1 \land \cdots \land |s_p \cap y| = k_p \ \land\
|t_1 \cap y^c| = l_1 \land \cdots \land |t_q \cap y^c| = l_q
\]

where the disjunction ranges over nonnegative integers
$k_1, \ldots, k_p, l_1, \ldots, l_q \geq 0$ that satisfy

\[ k_1 + \cdots + k_p + l_1 + \cdots + l_q = k \]

From (8) it follows that we can perform a similar transformation for
constraints of form $|t| \geq k$. After performing this
transformation, we bring the formula into disjunctive normal form and
continue eliminating the existential quantifier separately for each
disjunct, as argued in Section 3.1. We may therefore assume that all
conjuncts are of one of the forms $|s \cap y| = k$,
$|s \cap y^c| = k$, $|s \cap y| \geq k$, and $|s \cap y^c| \geq k$,
where $s \in P$.

If there are two conjuncts both of which contain $|s \cap y|$ for the
same $s$, then either they are contradictory or one implies the other.
We therefore assume that for any $s \in P$, there is at most one
conjunct $\psi_i$ containing $|s \cap y|$. For analogous reasons we
assume that for every $s \in P$ there is at most one conjunct $\psi_i$
containing $|s \cap y^c|$. The result of eliminating the variable $y$
is then given in Figure 1, applied separately to each $s \in P$; the
elements of $P$ are disjoint, so the choices of $s \cap y$ for
different $s$ are independent. The case when a literal containing
$|s \cap y|$ does not occur is covered by the case
$|s \cap y| \geq k$ for $k = 0$; similarly for a literal containing
$|s \cap y^c|$.

It remains to consider the case $m = 0$. Then $y$ is the only variable
occurring in conjuncts $\psi_i$. Every cardinality expression $t$
containing only $y$ reduces to one of $|y|$ or $|y^c|$. If there are
multiple literals containing $|y|$, they are either contradictory or
one implies the others. We may therefore assume there is at most one
literal containing $|y|$ and at most one literal containing $|y^c|$.
We eliminate the quantifier by applying the rules in Figure 1, putting
formally $s = 1$, where $1$ is the universal set.
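The rules of Figure 1 can be checked by brute force on small sets. The sketch below is illustrative only; the pair encoding `(op, k)` and the function names are assumptions made for this example.

```python
from itertools import combinations


def eliminate(lit_y, lit_yc):
    """Figure 1: combine the literal on |s ∩ y| with the one on |s ∩ y^c|.

    Each literal is (op, bound) with op in {"=", ">="} (hypothetical encoding).
    The result is a literal on |s|.
    """
    (op1, k), (op2, l) = lit_y, lit_yc
    op = "=" if op1 == "=" and op2 == "=" else ">="
    return (op, k + l)


def holds(literal, size):
    op, k = literal
    return size == k if op == "=" else size >= k


def exists_y(s, lit_y, lit_yc):
    """Brute force: is there a set y satisfying both literals?

    Only y ∩ s matters, so it suffices to enumerate subsets of s.
    """
    s = sorted(set(s))
    for n in range(len(s) + 1):
        for part in combinations(s, n):
            inside = len(part)                 # |s ∩ y|
            outside = len(s) - inside          # |s ∩ y^c|
            if holds(lit_y, inside) and holds(lit_yc, outside):
                return True
    return False
```

For instance, `eliminate(("=", 2), (">=", 1))` yields `(">=", 3)`, matching the second row of Figure 1.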



This completes the description of quantifier elimination from an
existentially quantified conjunction. By repeating this process for
all quantifiers we arrive at a quantifier-free formula $\psi$. Hence
we have the following theorem.

Theorem 14 For every first-order formula $\phi$ in the language of
boolean algebras with finite cardinality constraints there exists a
quantifier-free formula $\psi$ such that $\psi$ is a disjunction of
conjunctions of literals of form $|t| \geq k$ and $|t| = k$ where $t$
are terms of boolean algebra, the free variables of $\psi$ are a
subset of the free variables of $\phi$, and $\psi$ is equivalent to
$\phi$ on all algebras of finite sets.

Remark 15 Now consider the case when formula $\phi$ has no free
variables. By Theorem 14, $\phi$ is equivalent to $\psi$ where $\psi$
contains only terms without variables. A term without variables in
boolean algebra can always be simplified to $0$ or $1$. Because
$|0| = 0$, the literals with $|0|$ reduce to true or false, so we may
simplify them away. The expression $|1|$ evaluates to the number of
elements of the universal set, that is, the size of the domain whose
subsets form the boolean algebra. We call literals $|1| = k$ and
$|1| \geq k$ domain cardinality constraints. A quantifier-free formula
$\psi$ can therefore be written as a propositional combination of
domain cardinality constraints. We can simplify $\psi$ into a
disjunction of conjunctions of domain cardinality constraints and
transform each conjunction so that it contains at most one literal.
The result $\psi'$ is a single disjunction of domain cardinality
constraints. We may further assume that the disjunct of form
$|1| \geq k$ occurs at most once. Therefore, the truth value of each
closed boolean algebra formula is characterized by a set $C$ of
possible cardinalities of the domain. If $\psi'$ does not contain any
$|1| \geq k$ literals, the set $C$ is finite. Otherwise,
$C = C_0 \cup \{k, k+1, \ldots\}$ for some $k$, where $C_0$ is a
finite subset of $\{1, \ldots, k-1\}$.
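As an illustration of Remark 15, the following sketch summarises a disjunction of domain cardinality constraints as a pair $(C_0, k)$. The list-of-pairs encoding and the function names are assumptions made for this example only.

```python
def cardinality_set(disjuncts):
    """Summarise a disjunction of domain cardinality constraints.

    disjuncts: list of (op, k), meaning |1| = k or |1| >= k
               (hypothetical encoding).
    Returns (c0, threshold) describing C = c0 ∪ {threshold, threshold+1, ...};
    threshold is None when no |1| >= k literal occurs, so that C is finite.
    """
    lower_bounds = [k for op, k in disjuncts if op == ">="]
    # Several |1| >= k disjuncts collapse into the weakest one.
    threshold = min(lower_bounds) if lower_bounds else None
    # Equalities at or above the threshold are subsumed by it.
    c0 = {k for op, k in disjuncts
          if op == "=" and (threshold is None or k < threshold)}
    return c0, threshold


def true_in_domain_of_size(disjuncts, n):
    """Truth value of the closed formula on the algebra of subsets of an n-element set."""
    c0, threshold = cardinality_set(disjuncts)
    return n in c0 or (threshold is not None and n >= threshold)
```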

3.3 Feferman-Vaught Theorem 

The Feferman-Vaught technique is a way of 
discovering the first-order theories of com- 
plex structures by analyzing their components. 
This description is a little vague, and in 
fact the Feferman-Vaught technique itself has 
something of a floating identity. It works 
for direct products, as we shall see. Clever 
people can make it work in other situations too. 
— [22], page 458 

We next review the Feferman-Vaught theorem for direct products [14],
which implies that products of structures with decidable first-order
theories have decidable first-order theories.

The result was first obtained for strong and weak pow- 
ers of theories in [35]; [35] also suggests the generalization 
to products. Our sketch here mostly follows [14] and [35], 
see also [31, Chapter 12] as well as [22, Section 9.6]. Some- 
what specific to our presentation is the fact that we use a 
multisorted logic and build into the language the correspon- 
dence between formulas interpreted over C and the cylindric 
algebra of sets of positions. 

Let $L_C$ be a relational language. Let further $I$ be some nonempty
finite or countably infinite index set. For each $i \in I$ let
$\mathcal{C}_i = \langle C_i, \llbracket\_\rrbracket^{C_i} \rangle$ be
a decidable structure interpreting the language $L_C$.

We define the direct product of the family of structures
$\mathcal{C}_i$, $i \in I$, as the structure

\[ \mathcal{P} = \prod_{i \in I} \mathcal{C}_i \]

where $\mathcal{P} = \langle P, \llbracket\_\rrbracket^{P} \rangle$.
$P$ is the set of all functions $t$ such that $t(i) \in C_i$ for
$i \in I$, and $\llbracket\_\rrbracket^{P}$ is defined by

\[ \llbracket r \rrbracket^{P}(t_1, \ldots, t_k) \ =\ \forall i.\ \llbracket r \rrbracket^{C_i}(t_1(i), \ldots, t_k(i)) \]

for each relation symbol $r \in L_C$.



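inner formula relations for $r \in L_C$
    $r :: \mathsf{tuple}^k \to \mathsf{indset}$
inner logical connectives
    $\wedge', \vee' :: \mathsf{indset} \times \mathsf{indset} \to \mathsf{indset}$
    $\neg' :: \mathsf{indset} \to \mathsf{indset}$
    $\mathsf{true}', \mathsf{false}' :: \mathsf{indset}$
inner formula quantifiers
    $\exists', \forall' :: (\mathsf{tuple} \to \mathsf{indset}) \to \mathsf{indset}$
index set equality
    $=' :: \mathsf{indset} \times \mathsf{indset} \to \mathsf{bool}$
logical connectives
    $\wedge, \vee :: \mathsf{bool} \times \mathsf{bool} \to \mathsf{bool}$
    $\neg :: \mathsf{bool} \to \mathsf{bool}$
    $\mathsf{true}, \mathsf{false} :: \mathsf{bool}$
index set quantifiers
    $\exists^I, \forall^I :: (\mathsf{indset} \to \mathsf{bool}) \to \mathsf{bool}$
tuple quantifiers
    $\exists, \forall :: (\mathsf{tuple} \to \mathsf{bool}) \to \mathsf{bool}$

Figure 2: Operations in product structure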

For the purpose of quantifier elimination we consider a richer
language of statements about the product structure $\mathcal{P}$.
Figure 2 shows this richer language. The corresponding structure
$\mathcal{P}_2 = \langle P_2, \llbracket\_\rrbracket^{P_2} \rangle$
contains, in addition to the function space $P$, a copy of the boolean
algebra $2^I$ of subsets of the index set $I$. We interpret a relation
$r \in L_C$ by

\[ \llbracket r \rrbracket^{P_2}(t_1, \ldots, t_k) = \{\, i \mid \llbracket r \rrbracket^{C_i}(t_1(i), \ldots, t_k(i)) \,\} \]

We let $\llbracket \mathsf{true}' \rrbracket^{P_2} = I$ and write

\[ r(t_1, \ldots, t_k) =' \mathsf{true}' \]

to express $\llbracket r \rrbracket^{P}(t_1, \ldots, t_k)$. Hence
$\mathcal{P}_2$ is at least as expressive as $\mathcal{P}$.
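The two interpretations of a relation symbol can be contrasted in a few lines of code. This is a sketch under simplifying assumptions: the index set is finite, tuples are dictionaries from indices to elements, and the names `index_set` and `holds_in_product` are invented for the example.

```python
def index_set(base_relations, tuples):
    """Interpretation of r in the richer structure: the set of indices where r holds.

    base_relations: dict index -> predicate of k arguments (r at that index).
    tuples: list of k dicts, each mapping every index to an element.
    """
    return {i for i, rel in base_relations.items()
            if rel(*(t[i] for t in tuples))}


def holds_in_product(base_relations, tuples):
    """Interpretation of r in the direct product: r(t1..tk) =' true'."""
    return index_set(base_relations, tuples) == set(base_relations)


# Example: the order <= on integers at each of three indices.
LEQ = {i: (lambda a, b: a <= b) for i in range(3)}
T1 = {0: 1, 1: 2, 2: 3}
T2 = {0: 1, 1: 5, 2: 0}
```

With these values `index_set(LEQ, [T1, T2])` is `{0, 1}`, so the relation fails in the product although it holds at two of the three indices.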

Note that Figure 2 does not contain an equality relation between
tuples. If we need to express the equality between tuples, we assume
that some binary relation $r_0 \in L_C$ in the base structure is
interpreted as equality, and express the equality between tuples $t_1$
and $t_2$ using the formula:

\[ r_0(t_1, t_2) =' \mathsf{true}' \]

Figure 3 shows the semantics of the language in Figure 2. (The logic
has no partial functions, so we interpret the sort bool over the set
$\{\mathsf{true}, \mathsf{false}\}$.)



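inner formula relations for $r \in L_C$
    $\llbracket r \rrbracket^{P_2}(t_1, \ldots, t_k) = \{\, i \mid \llbracket r \rrbracket^{C_i}(t_1(i), \ldots, t_k(i)) \,\}$
inner logical connectives
    $\llbracket \wedge' \rrbracket^{P_2}(A_1, A_2) = A_1 \cap A_2$
    $\llbracket \vee' \rrbracket^{P_2}(A_1, A_2) = A_1 \cup A_2$
    $\llbracket \neg' \rrbracket^{P_2}(A) = I \setminus A$
    $\llbracket \mathsf{true}' \rrbracket^{P_2} = I$
    $\llbracket \mathsf{false}' \rrbracket^{P_2} = \emptyset$
inner formula quantifiers
    $\llbracket \exists' \rrbracket^{P_2} f = \bigcup_{t \in P} f(t)$
    $\llbracket \forall' \rrbracket^{P_2} f = \bigcap_{t \in P} f(t)$
index set equality
    $\llbracket =' \rrbracket^{P_2}(A_1, A_2) = (A_1 = A_2)$
logical connectives
    (interpreted as usual)
index set quantifiers
    $\llbracket \exists^I \rrbracket^{P_2} f = \exists A \in 2^I.\ f(A)$
tuple quantifiers
    $\llbracket \exists \rrbracket^{P_2} f = \exists t \in P.\ f(t)$

Figure 3: Semantics of operations in product structure $\mathcal{P}_2$

We let $A_1 \subseteq' A_2$ stand for $A_1 \wedge' A_2 =' A_1$.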

Note that the interpretations of $\wedge'$, $\vee'$, $\mathsf{true}'$,
$\mathsf{false}'$, $\neg'$, $\exists^I$, $\forall^I$ form a first-order
structure of boolean algebras of subsets of the set $I$. We call
formulas in this boolean algebra sublanguage index-set algebra
formulas.

On the other hand, relations $r$ for $r \in L_C$, together with
$\wedge'$, $\vee'$, $\neg'$, $\exists'$, $\forall'$ form the signature
of first-order logic with relation symbols. We call formulas built
only from these operations inner formulas.

Let $\phi$ be an inner formula with free tuple variables
$t_1, \ldots, t_m$ and no free indset variables. Then $\phi$ specifies
a relation $\rho \subseteq D^m$. Consider the corresponding
first-order formula $\phi'$ interpreted in the base structure
$\mathcal{C}$; formula $\phi'$ specifies a relation
$\rho' \subseteq C^m$. The following property follows from the
semantics in Figure 3:

\[ \rho(t_1, \ldots, t_m) = \{\, i \in I \mid \rho'(t_1(i), \ldots, t_m(i)) \,\} \tag{9} \]

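Sort constraints imply that quantifiers $\exists'$, $\forall'$ are
only applied to inner formulas. Let $\phi$ be a formula of sort bool.
By labelling subformulas of sort indset with variables
$A_1, \ldots, A_n$, we can write $\phi$ in form $\phi^1$:

\[
\begin{array}{l}
\exists^I A_1, \ldots, A_n. \\
\quad A_1 =' \phi_1 \ \land\ \ldots \ \land\ A_n =' \phi_n \ \land \\
\quad \psi(A_1, \ldots, A_n)
\end{array}
\]

where

\[ \phi = \psi(\phi_1, \ldots, \phi_n) \]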

Furthermore, by defining $B_1, \ldots, B_m$ to be the partition of
$\mathsf{true}'$ consisting of terms of form

\[ A_1^{p_1} \wedge' \cdots \wedge' A_n^{p_n} \]

for $p_1, \ldots, p_n \in \{0, 1\}$, we can find a formula $\psi'$ and
formulas $\phi'_1, \ldots, \phi'_m$ such that $\phi^1$ is equivalent
to $\phi^2$:

\[
\begin{array}{l}
\exists^I B_1, \ldots, B_m. \\
\quad B_1 =' \phi'_1 \ \land\ \ldots \ \land\ B_m =' \phi'_m \ \land \\
\quad \psi'(B_1, \ldots, B_m)
\end{array}
\tag{10}
\]

and where $\phi'_1, \ldots, \phi'_m$ evaluate to sets that form a
partition of $\mathsf{true}'$ for all values of free variables. (By
partition of $\mathsf{true}'$ we here mean a family of pairwise
disjoint sets whose union is $\mathsf{true}'$, but we do not require
the sets to be non-empty.)

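Now consider a formula of form $\exists t.\ \phi$ where $\phi$ is
without $\exists, \forall$ quantifiers (but possibly contains
$\exists', \forall'$ and $\exists^I, \forall^I$ quantifiers). We
transform $\phi$ into $\phi^2$ as described, and then replace

\[
\begin{array}{l}
\exists t.\ \exists^I B_1, \ldots, B_m. \\
\quad B_1 =' \phi'_1 \ \land\ \ldots \ \land\ B_m =' \phi'_m \ \land \\
\quad \psi'(B_1, \ldots, B_m)
\end{array}
\tag{11}
\]

with

\[
\begin{array}{l}
\exists^I D_1, \ldots, D_m.\ \exists^I B_1, \ldots, B_m. \\
\quad D_1 =' (\exists' t.\ \phi'_1) \ \land\ \ldots \ \land\ D_m =' (\exists' t.\ \phi'_m) \ \land \\
\quad B_1 \subseteq' D_1 \ \land\ \ldots \ \land\ B_m \subseteq' D_m \ \land \\
\quad \mathrm{partition}(B_1, \ldots, B_m) \ \land\ \psi'(B_1, \ldots, B_m)
\end{array}
\tag{12}
\]

where $\mathrm{partition}(B_1, \ldots, B_m)$ denotes a boolean algebra
expression expressing that sets $B_1, \ldots, B_m$ form the partition
of $\mathsf{true}'$.

It is easy to see that (11) and (12) are equivalent.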

By repeating this construction we eliminate all term quantifiers from
a formula. We then eliminate all set quantifiers as in Section 3.2.
For that purpose we extend the language with cardinality constraints.

As the result we obtain cardinality constraints on inner formulas.
Closed inner formulas evaluate to $\mathsf{true}'$ or
$\mathsf{false}'$ depending on their truth value in base structure
$\mathcal{C}$. Hence, if $\mathcal{C}$ is decidable, so is
$\mathcal{P}_2$.

Theorem 16 (Feferman-Vaught) Let $\mathcal{C}$ be a decidable
structure. Then every formula in the language of Figure 2 is
equivalent on the structure $\mathcal{P}_2$ to a propositional
combination of cardinality constraints of the index-set boolean
algebra, i.e. formulas of form $|\phi| \geq k$ and $|\phi| = k$ where
$\phi$ is an inner formula.

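Example 17 Let $r \in L_C$ be a binary relation on structure
$\mathcal{C}$. Let us eliminate quantifier $\exists t$ from the
formula $\phi(t_1, t_2)$:

\[
\begin{array}{l}
\exists t.\ \exists^I A_1, A_2, A_3. \\
\quad A_1 =' r(t, t_1) \ \land\ A_2 =' r(t_1, t) \ \land\ A_3 =' r(t_2, t) \ \land \\
\quad |\neg' A_1| = 0 \ \land\ |\neg' A_2| = 0 \ \land\ |\neg' A_3| \geq 1
\end{array}
\]

We first introduce sets $B_0, \ldots, B_7$ that form a partition of
$\mathsf{true}'$. The formula is then equivalent to $\phi_1$:

\[
\begin{array}{l}
\exists t.\ \exists^I B_0, B_1, B_2, B_3, B_4, B_5, B_6, B_7. \\
\quad B_0 =' r(t, t_1) \wedge' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad B_1 =' \neg' r(t, t_1) \wedge' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad B_2 =' r(t, t_1) \wedge' \neg' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad B_3 =' \neg' r(t, t_1) \wedge' \neg' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad B_4 =' r(t, t_1) \wedge' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad B_5 =' \neg' r(t, t_1) \wedge' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad B_6 =' r(t, t_1) \wedge' \neg' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad B_7 =' \neg' r(t, t_1) \wedge' \neg' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad \phi_0
\end{array}
\]

where

\[
\begin{array}{rl}
\phi_0 \ = & |B_1| = 0 \ \land\ |B_2| = 0 \ \land \\
& |B_3| = 0 \ \land\ |B_5| = 0 \ \land \\
& |B_6| = 0 \ \land\ |B_7| = 0 \ \land \\
& |B_4| \geq 1
\end{array}
\]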

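We now eliminate the quantifier $\exists t$ from the formula $\phi_1$,
obtaining formula $\phi_2$:

\[
\begin{array}{l}
\exists^I D_0, D_1, D_2, D_3, D_4, D_5, D_6, D_7. \\
\quad D_0 =' \exists' t.\ r(t, t_1) \wedge' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad D_1 =' \exists' t.\ \neg' r(t, t_1) \wedge' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad D_2 =' \exists' t.\ r(t, t_1) \wedge' \neg' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad D_3 =' \exists' t.\ \neg' r(t, t_1) \wedge' \neg' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad D_4 =' \exists' t.\ r(t, t_1) \wedge' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad D_5 =' \exists' t.\ \neg' r(t, t_1) \wedge' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad D_6 =' \exists' t.\ r(t, t_1) \wedge' \neg' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad D_7 =' \exists' t.\ \neg' r(t, t_1) \wedge' \neg' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad \phi_3
\end{array}
\]

where

\[
\begin{array}{rl}
\phi_3 \ = & \exists^I B_0, B_1, B_2, B_3, B_4, B_5, B_6, B_7. \\
& B_0 \subseteq' D_0 \ \land\ \ldots \ \land\ B_7 \subseteq' D_7 \ \land \\
& \mathrm{partition}(B_0, \ldots, B_7) \ \land\ \phi_0
\end{array}
\]

(the conjunct $\mathrm{partition}(B_0, \ldots, B_7)$ is the one
required by (12)). We next apply quantifier elimination for boolean
algebras to formula $\phi_3$ and obtain formula $\phi'_3$:

\[ \phi'_3 \ =\ |D_4| \geq 1 \ \land\ |\neg' D_0 \wedge' \neg' D_4| = 0 \]

Hence $\phi(t_1, t_2)$ is equivalent to

\[
\begin{array}{l}
\exists^I D_0, D_4. \\
\quad D_0 =' \exists' t.\ r(t, t_1) \wedge' r(t_1, t) \wedge' r(t_2, t) \ \land \\
\quad D_4 =' \exists' t.\ r(t, t_1) \wedge' r(t_1, t) \wedge' \neg' r(t_2, t) \ \land \\
\quad |D_4| \geq 1 \ \land\ |\neg' D_0 \wedge' \neg' D_4| = 0
\end{array}
\]

After substituting the definitions of $D_0$ and $D_4$, formula
$\phi(t_1, t_2)$ can be written without quantifiers
$\exists, \forall, \exists^I, \forall^I$.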

♦ 
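The outcome of Example 17 can be sanity-checked by brute force on a tiny product. The sketch below makes several assumptions that are not in the text: every component is the same three-element base structure, the index set has two elements, and $r$ is instantiated by the order $\leq$ (any other binary relation can be passed as `rel`).

```python
from itertools import product

C = [0, 1, 2]      # base domain (assumption for this example)
I = [0, 1]         # index set; indices double as tuple positions
TUPLES = list(product(C, repeat=len(I)))


def leq(a, b):
    return a <= b


def phi_direct(t1, t2, rel=leq):
    """The original formula: enumerate all candidate tuples t."""
    for t in TUPLES:
        a1 = {i for i in I if rel(t[i], t1[i])}
        a2 = {i for i in I if rel(t1[i], t[i])}
        a3 = {i for i in I if rel(t2[i], t[i])}
        # |¬'A1| = 0, |¬'A2| = 0, |¬'A3| >= 1
        if len(a1) == len(I) and len(a2) == len(I) and len(a3) < len(I):
            return True
    return False


def phi_eliminated(t1, t2, rel=leq):
    """The result of the example: only componentwise witnesses c are needed."""
    d0 = {i for i in I
          if any(rel(c, t1[i]) and rel(t1[i], c) and rel(t2[i], c) for c in C)}
    d4 = {i for i in I
          if any(rel(c, t1[i]) and rel(t1[i], c) and not rel(t2[i], c) for c in C)}
    # |D4| >= 1 and |¬'D0 ∧' ¬'D4| = 0
    return len(d4) >= 1 and not (set(I) - d0 - d4)
```

The point of the design is that `phi_eliminated` never ranges over `TUPLES`: the quantifier over tuples has been replaced by quantifiers over single components plus cardinality conditions on index sets.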

3.4 Term Algebras

In this section we present a quantifier elimination procedure for term
algebras (see Section 2.1). A quantifier elimination procedure for
term algebras implies that the first-order theory of term algebras is
decidable. In the sections below we build on the procedure in this
section to define quantifier elimination procedures for structural
subtyping.

The decidability of the first-order theory of term algebras follows
from Mal'cev's work on locally free algebras [31, Chapter 23]. [39]
also gives an argument for decidability of term algebra and presents a
unification algorithm based on congruence closure [38]. Infinite trees
are studied in [12]. [30] presents a complete axiomatization for
algebra of finite, infinite and rational trees. A proof in the style
of [22] for an extension of free algebra with queues is presented in
[43]. Decidability of an extension of term algebras with membership
tests is presented in [10] in the form of a terminating term rewriting
system. Unification and disunification problems are special cases of
the decision problem for the first-order theory of term algebras; for
a survey see e.g. [45, 9].

We believe that our proof provides some insight into different
variations of quantifier elimination procedures for term algebras.
Like [22] we use selector language symbols, but retain the usual
constructor symbols as well. The advantage of the selector language is
that $\exists y.\ z = f(x, y)$ is equivalent to the quantifier-free
formula $x = f_1(z) \land \mathsf{Is}_f(z)$. On the other hand,
constructor symbols also increase the set of relations on terms
definable via quantifier-free formulas, which can slightly simplify
the quantifier elimination procedure, as will be seen by comparing
Proposition 34 and Proposition 38. Compared to [22, Page 70], we find
that the termination of our procedure is more evident and the
extension to the term-power algebra in Section 6 easier. Our base
formulas somewhat resemble formulas arising in other quantifier
elimination procedures [31, 11, 30]. Our terminology also borrows from
congruence closure graphs like those of [39, 38], although we are not
primarily concerned with efficiency of the algorithm described. Term
algebra is an example of a theory of pairing functions, and [15] shows
that any non-empty family of theories of pairing functions has a
non-elementary lower bound on time complexity.

3.4.1 Term Algebra in Selector Language 

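To facilitate quantifier elimination we use a selector language
$\mathsf{Sel}(\Sigma)$ for term algebra [22, Page 61]. We define term
algebra in selector language as a first-order structure with partial
functions.

The set $\mathsf{Sel}(\Sigma)$ contains, for every function symbol
$f \in \Sigma$ of arity $\mathrm{ar}(f) = k$, a unary predicate
$\mathsf{Is}_f \subseteq \mathrm{FT}(\Sigma)$ and functions
$f_1, \ldots, f_k : \mathrm{FT}(\Sigma) \to \mathrm{FT}(\Sigma)$ such
that

\[
\begin{array}{lr}
\mathsf{Is}_f(t) \iff \exists t_1, \ldots, t_k.\ t = f(t_1, \ldots, t_k) & (13) \\
f_i(f(t_1, \ldots, t_k)) = t_i, \quad 1 \leq i \leq k & (14) \\
f_i(t) = \bot, \quad \neg \mathsf{Is}_f(t) & (15)
\end{array}
\]

For every $f \in \Sigma$ and $1 \leq i \leq \mathrm{ar}(f)$,
expression $f_i(t)$ is defined iff $\mathsf{Is}_f(t)$ holds, so we let
$D_{f_i} = \langle x, \mathsf{Is}_f(x) \rangle$.

As a special case, if $d$ is a constant, then $\mathrm{ar}(d) = 0$ and
$\mathsf{Is}_d(t) \iff t = d$.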
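Conditions (13)-(15) can be mirrored in a small executable model. The sketch assumes a hypothetical signature `ARITY`, represents a term as a tuple whose first component is the function symbol, and uses `None` in place of $\bot$.

```python
ARITY = {"a": 0, "b": 0, "g": 1, "f": 2}    # hypothetical signature


def mk(symbol, *args):
    """Constructor: build the term symbol(args)."""
    if len(args) != ARITY[symbol]:
        raise ValueError("wrong number of arguments for " + symbol)
    return (symbol, *args)


def is_f(symbol, term):
    """Is_f(t): t is of the form f(t1, ..., tk), cf. (13)."""
    return term[0] == symbol


def select(symbol, i, term):
    """f_i(t) for 1 <= i <= ar(f); None stands for the undefined value."""
    if not is_f(symbol, term):
        return None            # (15)
    return term[i]             # (14)
```

For a constant `d`, `is_f("d", t)` holds exactly when `t == mk("d")`, matching the special case stated above.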

Proposition 18 For every formula $\phi_1$ in the language
$\mathsf{Cons}(\Sigma)$ there exists an equivalent formula $\phi_2$ in
the selector language.

Proof Sketch. Because of the presence of the equality symbol, every
formula in language $\mathsf{Cons}(\Sigma)$ can be written in unnested
form such that every atomic formula is of one of two forms:
$x_1 = x_2$, or $f(x_1, \ldots, x_k) = y$, where $y$ and $x_i$ are
variables. We keep every formula $x_1 = x_2$ unchanged and transform
each formula

\[ f(x_1, \ldots, x_k) = y \]

into the well-defined conjunction

\[ x_1 = f_1(y) \ \land\ \cdots \ \land\ x_k = f_k(y) \ \land\ \mathsf{Is}_f(y) \]

■

Note that predicates $\mathsf{Is}_f$ form a partition of the set of
all terms, i.e. the following formulas are valid:

\[
\begin{array}{l}
\forall x.\ \bigvee_{f \in \Sigma} \mathsf{Is}_f(x) \\
\forall x.\ \neg(\mathsf{Is}_f(x) \land \mathsf{Is}_g(x)), \quad \text{for } f \not\equiv g
\end{array}
\tag{16}
\]

Figure 4: Quantifier Elimination for Term Algebra

A constructor-selector language contains both constructor symbols
$f \in \mathsf{Cons}(\Sigma)$ and selector symbols
$f_i \in \mathsf{Sel}(\Sigma)$.

3.4.2 Quantifier Elimination 

We proceed to quantifier elimination for term algebra. A schematic
view of our proof is in Figure 4. The basic insight is that any
quantifier-free formula can be written in a particular unnested form,
as a disjunction of base formulas. Base formulas trivially permit
elimination of an existential quantifier, yet every base formula can
be converted back to a quantifier-free formula.

A semi-base formula is almost a base formula, except that it may be
cyclic. We introduce cyclicity after explaining the graph
representation of a semi-base formula.

Definition 19 (Semi-Base Formula) A semi-base formula $\beta$ with

• free variables $x_1, \ldots, x_m$,

• internal non-parameter variables $u_1, \ldots, u_p$, and

• internal parameter variables $u_{p+1}, \ldots, u_{p+q}$

is a formula of form

\[
\begin{array}{l}
\exists u_1, \ldots, u_n. \\
\quad \mathrm{distinct}(u_1, \ldots, u_n) \ \land \\
\quad \mathrm{structure}(u_1, \ldots, u_n) \ \land \\
\quad \mathrm{labels}(u_1, \ldots, u_n; x_1, \ldots, x_m)
\end{array}
\]

where $n = p + q$. $\mathrm{distinct}(u_1, \ldots, u_n)$ enforces that
variables are distinct:

\[ \mathrm{distinct}(u_1, \ldots, u_n) \ \equiv \bigwedge_{1 \leq i < j \leq n} u_i \neq u_j . \]

$\mathrm{structure}(u_1, \ldots, u_n)$ specifies relationships between
terms denoted by variables:

\[ \mathrm{structure}(u_1, \ldots, u_n) \ \equiv\ \bigwedge_{i=1}^{p} u_i = t_i(u_1, \ldots, u_n) \]

where each $t_i(u_1, \ldots, u_n)$ is a term of form
$f(u_{i_1}, \ldots, u_{i_k})$ for $f \in \Sigma$,
$k = \mathrm{ar}(f)$.

$\mathrm{labels}(u_1, \ldots, u_n; x_1, \ldots, x_m)$ identifies some
free variables with some parameter and non-parameter variables:

\[ \mathrm{labels}(u_1, \ldots, u_n; x_1, \ldots, x_m) \ \equiv \bigwedge_{1 \leq i \leq m} x_i = u_{j_i} \]

for some function $j : \{1, \ldots, m\} \to \{1, \ldots, n\}$.

We require each semi-base formula to satisfy the following congruence
closure property: there are no two distinct variables $u_i$ and $u_j$
such that both $u_i = f(u_{l_1}, \ldots, u_{l_k})$ and
$u_j = f(u_{l_1}, \ldots, u_{l_k})$ occur as conjuncts in formula
structure.

We denote by $U$ the set of internal variables of a given semi-base
formula, $U = \{u_1, \ldots, u_n\}$.

Definition 20 A semi-base formula in selector language is obtained
from the semi-base formula in constructor language by replacing every
conjunct of form

\[ u_i = f(u_{i_1}, \ldots, u_{i_k}) \]

with the well-defined conjunction

\[ \mathsf{Is}_f(u_i) \ \land\ u_{i_1} = f_1(u_i) \ \land\ \cdots \ \land\ u_{i_k} = f_k(u_i) \]

A semi-base formula in selector language is clearly a well-formed
conjunction of literals. All atomic formulas in a semi-base formula
are unnested, in both constructor and selector language.

We can represent a semi-base formula as a labelled directed graph with
the set of nodes $U$; we call this graph the graph associated with a
semi-base formula. Nodes of the graph are in a bijection with internal
variables of the semi-base formula. We call nodes corresponding to
parameter variables $u_{p+1}, \ldots, u_{p+q}$ parameter nodes; nodes
$u_1, \ldots, u_p$ are non-parameter nodes. Each non-parameter node is
labelled by a function symbol $f \in \Sigma$ and has exactly
$\mathrm{ar}(f)$ successors, with an edge from $u_k$ to $u_l$ labelled
by the positive integer $i$ iff $f_i(u_k) = u_l$ occurs in the
semi-base formula written in selector language. A constant node is a
node labelled by some constant symbol $c \in \Sigma$,
$\mathrm{ar}(c) = 0$. A constant node is a sink in the graph; every
sink is either a constant or a parameter node. In addition to the
labelling by function symbols, each node $u \in U$ of the graph is
labelled by zero or more free variables $x$ such that the equation
$x = u$ occurs in the semi-base formula.

Definition 21 (Base Formula) A semi-base formula $\phi$ is a base
formula iff the graph associated with $\phi$ is acyclic.

A semi-base formula whose associated graph is cyclic is unsatisfiable
in the term algebra of finite terms. Checking the cyclicity of a
semi-base formula corresponds to the occur-check in unification
algorithms (see e.g. [29, 11]).
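A possible representation of the associated graph, together with the acyclicity test of Definition 21 and the congruence closure property of Definition 19, is sketched below. The dictionary encoding is an assumption made for illustration: keys are non-parameter nodes, and any node that appears only as a successor is treated as a parameter node.

```python
def is_base(structure):
    """Return True iff the associated graph is acyclic (Definition 21).

    structure: dict non-parameter node -> (symbol, [successor nodes]).
    """
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(u):
        state = colour.get(u, WHITE)
        if state == GREY:          # back edge: u is a proper subterm of itself
            return False
        if state == BLACK:
            return True
        colour[u] = GREY
        for v in structure.get(u, ("", []))[1]:
            if not visit(v):
                return False
        colour[u] = BLACK
        return True

    return all(visit(u) for u in structure)


def congruence_closed(structure):
    """No two distinct nodes are defined by the same symbol and successors."""
    seen = set()
    for symbol, successors in structure.values():
        key = (symbol, tuple(successors))
        if key in seen:
            return False
        seen.add(key)
    return True
```

A cyclic example is `{"u1": ("g", ["u2"]), "u2": ("g", ["u1"])}`, which would require `u1 = g(g(u1))` and therefore has no solution among finite terms.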

Definition 22 By height $\mathcal{H}(u)$ of a node $u$ in the acyclic
graph we mean the length of the longest path starting from $u$.

A node $u$ is a sink iff $\mathcal{H}(u) = 0$.

Definition 23 We say that an internal variable $u_l$ is a source
variable of a base formula $\beta$ iff $u_l$ is represented by a node
that is a source in the directed acyclic graph corresponding to
$\beta$. Equivalently, if $\beta$ is written in the selector language,
then $u_l$ is a source variable iff $\beta$ contains no equations of
form $u_l = f_i(u_k)$.

Definition 24 If $u_i$ and $u_j$ are internal variables, we write
$u_i \leadsto^* u_j$ if there is a path in the underlying graph from
node $u_i$ to node $u_j$. Equivalently, $u_i \leadsto^* u_j$ iff there
exists a term $t(u_i)$ in the selector language such that
$\models \beta \Rightarrow u_j = t(u_i)$.

Relation $\leadsto^*$ is a partial order on internal variables of
$\beta$.

The following Lemma 25 is similar to the Independence of Disequations
Lemma in e.g. [10, Page 178].

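Lemma 25 Let $\beta$ be a base formula of the form

\[ \exists u_1, \ldots, u_p, u_{p+1}, \ldots, u_{p+q}.\ \beta_0 \]

where $u_{p+1}, \ldots, u_{p+q}$ are parameter variables of $\beta$,
and $\beta_0$ is quantifier-free. Let $S_{p+1}, \ldots, S_{p+q}$ be
infinite sets of terms. Then there exists a valuation $\sigma$ such
that $\llbracket \beta_0 \rrbracket \sigma = \mathsf{true}$ and

\[ \llbracket u_l \rrbracket \sigma \in S_l \quad \text{for } p + 1 \leq l \leq p + q. \]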

Proof. To construct $\sigma$, first assign values to parameter
variables, as follows. Let $h_G$ be the length of the longest path in
the graph associated with $\beta$. Pick
$\sigma(u_{p+1}) \in S_{p+1}$ so that $h(\sigma(u_{p+1})) > h_G$, and
for each $i$ where $p + 2 \leq i \leq p + q$ pick
$\sigma(u_i) \in S_i$ so that
$h(\sigma(u_i)) > h(\sigma(u_{i-1})) + h_G$. The set of heights of an
infinite set of terms is infinite, so it is always possible to choose
such $\sigma(u_i)$.

Next consider internal nodes $u_1, \ldots, u_{p+q}$ in some
topological order in which every node comes after its successors. For
each non-parameter node $u_i$ such that
$u_i = f(u_{i_1}, \ldots, u_{i_k})$ occurs in $\beta_0$, let
$\sigma(u_i) = f(\sigma(u_{i_1}), \ldots, \sigma(u_{i_k}))$.

Finally assign the values to free variables by
$\sigma(x) = \sigma(u)$ where $x = u$ occurs in $\beta_0$.

By construction,
$\llbracket \mathrm{structure} \rrbracket \sigma = \mathsf{true}$ and
$\llbracket \mathrm{labels} \rrbracket \sigma = \mathsf{true}$. It
remains to show
$\llbracket \mathrm{distinct} \rrbracket \sigma = \mathsf{true}$, i.e.
$\sigma(u_i) \neq \sigma(u_j)$ for $1 \leq i, j \leq p + q$,
$i \neq j$. We show this property of $\sigma$ by induction on
$m = \min(\mathcal{H}(u_i), \mathcal{H}(u_j))$. Without loss of
generality we assume $\mathcal{H}(u_i) \leq \mathcal{H}(u_j)$.

Consider first the case $m = 0$. Then $u_i$ is a parameter or a
constant node.

If $u_i$ is a constant and $u_j$ is a non-parameter variable then
$u_i$ and $u_j$ are labelled by different function symbols, so
$\sigma(u_i) \neq \sigma(u_j)$.

If $u_i$ is a constant and $u_j$ is a parameter variable then
$h(\sigma(u_i)) = 0$ whereas $h(\sigma(u_j)) > h_G \geq 0$.

Consider the case where $u_i$ is a parameter variable and $u_j$ is a
non-parameter variable. Let

\[ J = \{\, j_1 \mid u_{j_1} \text{ is a parameter variable s.t. } u_j \leadsto^* u_{j_1} \,\} \]

If $J = \emptyset$, then $\beta_0$ uniquely specifies $\sigma(u_j)$,
and

\[ h(\sigma(u_j)) = \mathcal{H}(u_j) \leq h_G < h(\sigma(u_i)) \]

Let $J \neq \emptyset$ and $j_0 = \max J$. If $i \leq j_0$, then

\[ h(\sigma(u_i)) \leq h(\sigma(u_{j_0})) < h(\sigma(u_j)) \]

If $j_0 < i$ then

\[ h(\sigma(u_j)) \leq h(\sigma(u_{j_0})) + h_G < h(\sigma(u_{j_0+1})) \leq h(\sigma(u_i)) \]

Now consider the case $m > 0$. $u_i$ and $u_j$ are non-parameter
nodes, so let $u_i = f(u_{i_1}, \ldots, u_{i_k})$ and
$u_j = g(u_{j_1}, \ldots, u_{j_l})$. If $f \not\equiv g$ then clearly
$\sigma(u_i) \neq \sigma(u_j)$. Otherwise, by the congruence closure
property of base formulas, there exists $d$ such that
$u_{i_d} \neq u_{j_d}$. Then by the induction hypothesis
$\sigma(u_{i_d}) \neq \sigma(u_{j_d})$, so
$\sigma(u_i) \neq \sigma(u_j)$. ■
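The construction in the proof can be followed step by step in code. The sketch below rests on assumptions that are not in the text: terms are nested tuples, each infinite set $S_l$ is given as a function returning an iterator whose terms have unbounded heights, and the example set consists of the terms $a, g(a), g(g(a)), \ldots$.

```python
def height(term):
    """h(t): height of a term represented as (symbol, subterm, ...)."""
    return 0 if len(term) == 1 else 1 + max(height(c) for c in term[1:])


def build_valuation(structure, params, candidates):
    """Choose values as in the proof of Lemma 25.

    structure:  dict non-parameter node -> (symbol, [successors]), acyclic.
    params:     parameter nodes in the order u_{p+1}, ..., u_{p+q}.
    candidates: dict parameter node -> function returning an iterator over
                the infinite set S_l (assumed to have unbounded heights).
    """
    def graph_height(u):
        successors = structure[u][1] if u in structure else []
        return 0 if not successors else 1 + max(graph_height(v) for v in successors)

    h_g = max((graph_height(u) for u in structure), default=0)

    sigma, bound = {}, h_g
    for u in params:
        # First candidate strictly higher than the current bound.
        sigma[u] = next(t for t in candidates[u]() if height(t) > bound)
        bound = height(sigma[u]) + h_g

    def value(u):
        if u not in sigma:
            symbol, successors = structure[u]
            sigma[u] = (symbol, *(value(v) for v in successors))
        return sigma[u]

    for u in structure:
        value(u)
    return sigma


def g_towers():
    """Example infinite set: a, g(a), g(g(a)), ..."""
    term = ("a",)
    while True:
        yield term
        term = ("g", term)
```

In the test below the gap $h_G$ between parameter heights is what prevents a collision: choosing $g^4(a)$ for the second parameter would make it equal to the value of the node defined as $g(u_3)$.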

Corollary 26 Every base formula is satisfiable.

Proposition 27 (Quantification of Base Formula) If $\beta$ is a base
formula and $x$ a free variable in $\beta$, then there exists a base
formula $\beta_1$ equivalent to $\exists x.\ \beta$.

Proof. Consider a formula $\exists x.\ \beta$ where $\beta$ is a base
formula. The only place where $x$ occurs in $\beta$ is a conjunct
$x = u_j$ in the subformula labels. By dropping the conjunct
$x = u_j$ from $\beta$ we obtain a base formula $\beta_1$ where
$\beta_1$ is equivalent to $\exists x.\ \beta$. ■

Proposition 28 (Quantifier-Free to Base) Every well-defined
quantifier-free formula in constructor-selector language can be
written as true, false, or a disjunction of base formulas.

Proof Sketch. Let $\phi$ be a well-defined quantifier-free formula in
constructor-selector language. By Proposition 8 we can transform
$\phi$ into an equivalent formula in disjunctive normal form

\[ \psi_1 \lor \cdots \lor \psi_p \]

where each $\psi_i$ is a well-defined conjunction of literals.
Consider an arbitrary $\psi_i$. There exists an unnested
quantifier-free formula $\psi'_i$ with additional fresh free variables
$x_1, \ldots, x_q$ such that $\psi_i$ is equivalent to

\[ \exists x_1, \ldots, x_q.\ \psi'_i \]

By distributivity and (6) it suffices to transform each conjunction of
unnested formulas into a disjunction of base formulas. In the sequel
we will assume transformations based on distributivity and (6) are
applied whenever we transform a conjunction of literals into a formula
containing disjunction. We also assume that every equation
$f(x_1, \ldots, x_n) = y$ is replaced by the equivalent one
$y = f(x_1, \ldots, x_n)$ and every equation $f_i(x) = y$ is replaced
by $y = f_i(x)$.

Because of our assumption that $\Sigma$ is finite, we can eliminate
every literal of form $\neg \mathsf{Is}_f(x)$ using the equivalence

\[ \neg \mathsf{Is}_f(x) \iff \bigvee_{g \in \Sigma \setminus \{f\}} \mathsf{Is}_g(x) \tag{17} \]

which follows from (16). We then transform the formula back into
disjunctive normal form and propagate the existential quantifiers to
the conjunctions of literals. We may therefore assume that there are
no literals of form $\neg \mathsf{Is}_f(x)$ in the conjunction.
Furthermore,
$\mathsf{Is}_f(x) \land \mathsf{Is}_g(x) \iff \mathsf{false}$ for
$f \not\equiv g$, so we may assume that for a variable $x$ there is at
most one literal $\mathsf{Is}_f(x)$ for some $f$. If $f_i(x)$ occurs
in the conjunction, because the conjunction is well-defined, we may
always add the conjunct $\mathsf{Is}_f(x)$. This way we ensure that
exactly one literal of form $\mathsf{Is}_f(x)$ occurs in the
conjunction.

We next ensure that every variable has either none or all of its
components named by variables. If the conjunction contains literal
$\mathsf{Is}_f(x)$ but does not contain $x = f(x_1, \ldots, x_n)$ and
does not contain an equation of form $y = f_i(x)$ for every $i$,
$1 \leq i \leq \mathrm{ar}(f)$, we introduce a fresh existentially
quantified variable for each $i$ such that a term of form
$y = f_i(x)$ does not appear in the conjunction. At this point we may
transform the entire conjunction into constructor language by
replacing

\[ \mathsf{Is}_f(u_i) \ \land\ v_{i_1} = f_1(u_i) \ \land\ \cdots \ \land\ v_{i_k} = f_k(u_i) \]

with $u_i = f(v_{i_1}, \ldots, v_{i_k})$ for $k = \mathrm{ar}(f)$.

We next ensure that for every two variables $x_1$ and $x_2$ occurring
in the conjunction exactly one of the conjuncts $x_1 = x_2$ or
$x_1 \neq x_2$ is present. Namely, if both conjuncts $x_1 = x_2$ and
$x_1 \neq x_2$ are present, the conjunction is false. If none of the
conjuncts is present, we insert the disjunction
$x_1 = x_2 \lor x_1 \neq x_2$ as one of the conjuncts and transform
the result into a disjunction of existentially quantified
conjunctions.

We next perform congruence closure for finite terms [38] on the
resulting conjunction, using the fact that equality is reflexive,
symmetric, transitive and congruent with respect to free operations
$f \in \mathsf{Cons}(\Sigma)$ and that $t(x) \neq x$ for every term
$t \not\equiv x$. Syntactically, the result of congruence closure can
be viewed as adding new equations to the conjunction. If the
congruence closure procedure establishes that the formula is
unsatisfiable, the result is false. Otherwise, all variables are
grouped into equivalence classes. If a conjunct $u_1 = u_2$ occurs in
the conjunction where both $u_1$ and $u_2$ are internal variables, we
replace $u_1$ with $u_2$ in the formula and eliminate the existential
quantifier. If for some free variable $x$ there is no internal
variable $u$ such that the conjunct $x = u$ occurs, we introduce a new
existentially quantified variable $u$ and a conjunct $x = u$. These
transformations ensure that for every equivalence class there exists
exactly one internal variable in the formula. It is now easy to pick
representative conjuncts from the conjunction to obtain a conjunction
of the syntactic form in Definition 19 of semi-base formula. The
resulting formula is a base formula because the congruence closure
algorithm ensures that the associated graph is acyclic.
■

We next turn to the problem of transforming a base formula into a
quantifier-free formula. We will present two constructions. The first
construction yields a quantifier-free formula in constructor-selector
language and is sufficient for the purpose of quantifier elimination.
The second construction yields a quantifier-free formula in selector
language and is slightly more involved; we present it to provide
additional insight into the quantifier elimination approach to term
algebras.

We first introduce notions of covered and determined variables of a
base formula $\beta$. The basic idea behind these notions is that
$\beta$ implies a functional dependence from the free variables of
$\beta$ to each of the determined variables.

In both constructions we use the notion of a covered variable, which
denotes a component of a term denoted by some free variable. In the
first construction we also use the notion of determined variable,
which includes covered variables as well as variables constructed from
covered variables using constructor operations
$f \in \mathsf{Cons}(\Sigma)$.

Definition 29 Consider an arbitrary base formula $\beta$. We say that
an internal variable $u$ is covered by a free variable $x$ iff
$x = u'$ occurs in $\beta$ for some $u'$ such that
$u' \leadsto^* u$. An internal variable $u$ is covered iff $u$ is
covered by $x$ for some free variable $x$ (in particular, if $x = u$
occurs in $\beta$ then $u$ is covered). Let covered denote the set of
covered internal variables of the base formula, and let
$\mathsf{uncovered} = U \setminus \mathsf{covered}$ where $U$ is the
set of all internal variables of $\beta$.

Lemma 30 (Covered Base to Selector) Every base formula without
uncovered variables is equivalent to a quantifier-free formula in
selector language.

Proof. Consider a base formula $\beta$ where every variable is
covered. Consider an arbitrary quantified variable $u$. Because $u$ is
covered, there exists a variable $x$ free in $\beta$ such that
$u = t(x)$ for some term $t$ in the selector language. Replace every
occurrence of $u$ in the matrix of $\beta$ by $t(x)$ and eliminate the
quantification over $u$. Repeating this process for every variable $u$
we obtain a quantifier-free formula equivalent to $\beta$. ■

Definition 31 Let $\beta$ be a base formula. The set determined of
determined variables of $\beta$ is the smallest set $S$ that contains
the set covered and satisfies the following condition: if $u$ is a
non-parameter node and all successors $u_1, \ldots, u_k$
($k \geq 0$) of $u$ in the associated graph are in $S$, then $u$ is
also in $S$.

In particular, every constant node is determined. A parameter node $u$
is determined iff $u$ is covered.
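The sets covered and determined can be computed directly from the graph. The following sketch assumes the same hypothetical encoding as before (a dictionary from non-parameter nodes to a symbol and a successor list) plus a dictionary `labels` from free variables to the internal nodes they are equated with.

```python
def reachable(structure, start):
    """All nodes on paths starting at `start`, including `start` itself."""
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(structure.get(u, ("", []))[1])
    return seen


def covered(structure, labels):
    """Nodes that are components of a term denoted by some free variable."""
    result = set()
    for node in labels.values():
        result |= reachable(structure, node)
    return result


def determined(structure, labels):
    """Smallest set containing covered and closed under constructor application."""
    result = covered(structure, labels)
    changed = True
    while changed:
        changed = False
        for u, (_, successors) in structure.items():
            # Constant nodes have no successors, so they are always added.
            if u not in result and all(v in result for v in successors):
                result.add(u)
                changed = True
    return result
```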

Lemma 32 If a node $u$ is not determined, then there exists an
uncovered parameter node $v$ such that $u \leadsto^* v$.

Proof. The proof is by induction on $\mathcal{H}(u)$. If
$\mathcal{H}(u) = 0$ then $u$ has no successors, and $u$ cannot be a
constant node because it is not determined. Therefore, $u$ is a
parameter node, and it is uncovered because it is not determined, so
we may let $v = u$. Assume that the statement holds for every node
$u'$ such that $\mathcal{H}(u') \leq k$ and let
$\mathcal{H}(u) = k + 1$. Because $u$ is not determined, there exists
a successor $u'$ of $u$ such that $u'$ is not determined, so by the
induction hypothesis there exists an uncovered parameter node $v$ such
that $u' \leadsto^* v$. Hence $u \leadsto^* u' \leadsto^* v$. ■

Lemma 33 Every base formula $\beta$ is equivalent to a base
formula $\beta'$ obtained from $\beta$ by eliminating all nodes that are
not determined.

Proof. Construct $\beta'$ from $\beta$ by eliminating all terms
containing a variable $u \in U \setminus$ determined and eliminating
the corresponding existential quantifiers. Then all variables
in $\beta'$ are determined. $\beta'$ has fewer conjuncts than $\beta$, so
$\models \beta \Rightarrow \beta'$. To show $\models \beta' \Rightarrow \beta$, let $\sigma$ be any assignment
of terms to the determined variables of $\beta$ such that the matrix
of $\beta'$ evaluates to true under $\sigma$. As in the proof of Lemma 25,
define the extension $\sigma'$ of $\sigma$ as follows. Choose sufficiently large values
$\sigma'(v)$ for every uncovered parameter variable $v$, so that $\sigma''$, defined
as the unique extension of $\sigma'$ to the remaining undetermined
variables, assigns different terms to different variables. This
is possible because the term model is infinite. The resulting
assignment $\sigma''$ satisfies the matrix of the base formula
$\beta$. Therefore, $\models \beta' \Rightarrow \beta$, so $\beta$ and $\beta'$ are equivalent base
formulas. ■



First Construction 

Proposition 34 (Base to Constructor-Selector)
Every base formula $\beta$ is equivalent to a quantifier-free
formula $\phi$ in the constructor-selector language.

Proof. By Lemma 33 we may assume that all variables
in $\beta$ are determined. To every variable $u$ we assign a term
$\tau(u)$. Term $\tau(u)$ is in the constructor-selector language and the
variables of $\tau(u)$ are among the free variables of $\beta$. If $u \in$
covered, we assign $\tau(u)$ as in the proof of Lemma 30. If
$u_1, \ldots, u_k$ are the successors of a determined node $u$, we
put

\[ \tau(u) = f(\tau(u_1), \ldots, \tau(u_k)) \]

where $f$ is the label of node $u$. This definition uniquely
determines $\tau(u)$ for all $u \in$ determined. We obtain the
quantifier-free formula $\phi$ by replacing every variable $u$ with
$\tau(u)$ and eliminating all quantifiers.






For every $u$ we have $\models \beta \Rightarrow u = \tau(u)$, so $\models \beta \Rightarrow \phi$.
Conversely, if $\phi$ is satisfied then $\tau$ defines an assignment for
the $u$ variables which makes the matrix of $\beta$ true. Therefore $\beta$
and $\phi$ are equivalent. ■
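The assignment of terms $\tau(u)$ in the first construction can be sketched in Python as follows. The data layout (`successors` as a map from a node to its label and children, `covered_term` as a map from covered nodes to selector terms written as strings) is an assumption of this illustration, not notation from the paper.

```python
def build_tau(successors, covered_term):
    """Return a function tau mapping determined nodes to constructor-selector terms.

    successors: node -> (label f, list of successor nodes) for non-parameter nodes
    covered_term: covered node -> selector term over free variables (as in Lemma 30)
    """
    memo = dict(covered_term)

    def tau(u):
        if u in memo:
            return memo[u]
        if u not in successors:
            # An uncovered parameter node is not determined (cf. Lemma 33).
            raise ValueError(f"{u} is an uncovered parameter node")
        f, children = successors[u]
        # The recursion terminates because the associated graph is acyclic.
        memo[u] = f if not children else f"{f}({', '.join(tau(c) for c in children)})"
        return memo[u]

    return tau

# Hypothetical formula: u1 = g(u2, u3), u2 = a, u3 = g(u4, u2), with u4 = g_1(x) covered.
successors = {"u1": ("g", ["u2", "u3"]), "u2": ("a", []), "u3": ("g", ["u4", "u2"])}
tau = build_tau(successors, {"u4": "g_1(x)"})
print(tau("u1"))
```

Covered nodes keep their selector terms, and every other determined node is rebuilt from its successors with its constructor label.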

Second Construction The reason for using constructor
symbols $f \in \mathrm{Cons}(\Sigma)$ in the first construction is to
preserve the constraints of form $u \neq v$ when eliminating node
$u$ with successors $u_1, \ldots, u_k$. Using constructor symbols we
would obtain the constraint $f(u_1, \ldots, u_k) \neq v$. Our second
construction avoids introducing constructor operations by
decomposing $f(u_1, \ldots, u_k) \neq v$ into a disjunction of
inequalities of form $u_i \neq f_i(v)$. When $v$ is a parameter node, the
presence of the term $f_i(v)$ potentially requires introducing a new
node in the associated graph; we call this process parameter
expansion. Parameter expansion may increase the total
number of nodes in the graph, but it decreases the number
of uncovered nodes, so the process of converting a base
formula to a quantifier-free formula in the selector language
terminates.

Lemma 35 Let $\beta$ be an arbitrary base formula.

1. If $u$ is covered and $u \leadsto^{*} u'$ then $u'$ is covered as well.

2. If $u'$ is uncovered and $u'$ is not a source, then there
exists $u \neq u'$ such that $u \leadsto^{*} u'$ and $u$ is also uncovered.

3. If $\beta$ contains an uncovered variable then $\beta$ contains an
uncovered variable that is a source.

Proof. By definition. ■

Parameter Expansion We define the operation of expanding
a parameter node in a base formula as follows. Let
$\beta$ be an arbitrary base formula and $w$ a parameter variable
in $\beta$. The result of expansion of $w$ is a disjunction of base
formulas $\beta'$ generated by applying (13) to $w$. In each of
the resulting formulas $\beta'$, variable $w$ is no longer a parameter.
Each $\beta'$ contains $\mathrm{Is}_f(w)$ for some $f \in \Sigma$, and node $w$
has successors $u_1, \ldots, u_k$ for $k = \mathrm{ar}(f)$. Each successor $u_i$ is
either an existing internal variable or a fresh variable. For a
given $\beta$, parameter expansion generates a disjunction of formulas $\beta'$
for every choice of $f \in \Sigma$ and every choice of successors $u_i$,
subject to congruence closure so that $\beta'$ is a base formula:
we discard the choices of successors of $w$ that yield formulas
$\beta'$ violating congruence of equality. (This process is similar
to converting quantifier-free formulas into a disjunction of
base formulas in the proof of Proposition 28.) The following
lemma shows the correctness of parameter expansion.

Lemma 36 (Parameter expansion soundness) Let
$\Delta = \beta'_1 \lor \cdots \lor \beta'_k$ be the disjunction generated by parameter
expansion of a base formula $\beta$. Then $\Delta$ is equivalent to $\beta$.

Lemma 36 justifies the use of parameter expansion in the
following Lemma 37.

Lemma 37 Every base formula $\beta$ can be written as a
disjunction of base formulas without uncovered variables.

Proof Sketch. By Lemma 33 we may assume that all variables
of $\beta$ are determined. Suppose $\beta$ contains an uncovered
variable. Then by Lemma 35, $\beta$ contains an uncovered variable
$u_0$ such that $u_0$ is a source. Because $u_0$ is uncovered
and determined, it is not a parameter node. We show how to
eliminate $u_0$ without introducing new uncovered variables.

Our goal is to eliminate $u_0$ from the associated graph.
We need to preserve the information that $u_0$ is distinct from
the variables $u \in U \setminus \{u_0\}$ in the graph. We consider two cases.

If $u$ is not a parameter node, then by congruence closure
either $u_0$ and $u$ are labelled by different function symbols,
or they are labelled by the same function symbol $f \in \Sigma$
with $\mathrm{ar}(f) = k$ and there exist $i$, $1 \leq i \leq k$, and variables
$u_i = f_i(u_0)$ and $u'_i = f_i(u)$ such that $u_i \neq u'_i$. Hence the
constraint $u_0 \neq u$ is deducible from the inequalities of other
variables in $\beta$ and we can eliminate $u_0$ without changing the
truth value of $\beta$.

Next consider the case when $u$ is a parameter node. By
assumption $u$ is determined, and because it is a parameter, it
is covered. We then perform parameter node expansion as
described above. The result of elimination of $u_0$ in $\beta$ is a
disjunction of base formulas $\beta'$; in each $\beta'$ every parameter
node is expanded. If $u$ is a parameter node in $\beta$ then the
constraint $u_0 \neq u$ is preserved in each $\beta'$ because $u$ is not a
parameter node in $\beta'$, so the previous argument applies.

Because the parameter nodes being expanded are covered,
so are their successor nodes introduced by parameter
expansion. Therefore, by repeatedly applying elimination
of uncovered variables for every uncovered variable $u_0$, we
obtain a disjunction $\Delta$ of formulas $\beta'$ where each $\beta'$ has no
uncovered variables, and $\Delta$ is equivalent to $\beta$. ■

Proposition 38 (Base to Selector) For every base formula
$\beta$ there exists an equivalent quantifier-free formula $\psi$
in the selector language.

Proof. By Lemma 37, $\beta$ is equivalent to a disjunction
$\beta_1 \lor \cdots \lor \beta_n$ where each $\beta_i$ has no uncovered variables. By
Lemma 30, each $\beta_i$ is equivalent to some quantifier-free formula
$\psi_i$, so $\beta$ is equivalent to the quantifier-free formula
$\psi_1 \lor \cdots \lor \psi_n$. ■

The final theorem in this section summarizes quantifier
elimination for term algebra.

Theorem 39 (Term Algebra Quantifier Elimination)
There exist algorithms A, B, C such that for a given formula
$\phi$ in the constructor-selector language of term algebras:

a) A produces a quantifier-free formula $\phi'$ in the
constructor-selector language;

b) B produces a quantifier-free formula $\phi'$ in the selector
language;

c) C produces a disjunction $\phi'$ of base formulas.

Proof. a): Transform formula $\phi$ into prenex form

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1} Q_n x_n.\ \phi^* \]

where $\phi^*$ is quantifier-free, as in Section 3.1. We eliminate
the innermost quantifier $Q_n$ as follows.

Suppose first that $Q_n$ is $\exists$. Transform the matrix $\phi^*$ into
disjunctive normal form $C_1 \lor \cdots \lor C_l$. By Proposition 28,
transform $C_1 \lor \cdots \lor C_l$ into a disjunction $\beta_1 \lor \cdots \lor \beta_m$ of base
formulas. Then propagate $\exists$ into individual disjuncts, using

\[ \exists x_n.\ \beta_1 \lor \cdots \lor \beta_m \iff (\exists x_n.\ \beta_1) \lor \cdots \lor (\exists x_n.\ \beta_m) \]

By Proposition 27, an existentially quantified base formula
is again a base formula, so $\exists x_n.\ \beta_i \iff \beta'_i$ for some
base formula $\beta'_i$. We thus obtain the formula

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1}.\ \beta'_1 \lor \cdots \lor \beta'_m \tag{18} \]

By Proposition 34, every base formula is equivalent to a
quantifier-free formula in the constructor-selector language,
so (18) is equivalent to

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1}.\ \psi \]

where $\psi$ is a quantifier-free formula. Hence, we have
eliminated the innermost existential quantifier.

Next consider the case when $Q_n$ is $\forall$. Then $\phi$ is
equivalent to

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1}.\ \neg \exists x_n.\ \neg\phi^* \]

Apply the procedure for eliminating $\exists x_n$ to $\neg\phi^*$. The result
is a formula of form

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1}.\ \neg\psi \tag{19} \]

where $\psi$ is quantifier-free. But $\neg\psi$ is also quantifier-free, so
we have eliminated the innermost universal quantifier. By
repeating this process we eliminate all quantifiers, yielding
the desired formula $\phi'$.

The direct construction for showing b) is analogous to
a), but uses Proposition 38 in place of Proposition 34. To
show c), apply e.g. construction a) to obtain a quantifier-free
formula and then transform it into a disjunction of base
formulas using Proposition 28. ■

This completes our description of quantifier elimination 
for term algebras. 

We remark that there are alternative ways to define base 
formula. In particular the requirement on disequality of all 
variables is not necessary. This requirement may lead to 
unnecessary case analysis when converting a quantifier-free 
formula to disjunction of base formulas, but we believe that 
it simplifies the correctness argument. 

4 The Pair Constructor and Two Constants 

In this section we give a quantifier elimination procedure for
structural subtyping of non-recursive types with two constant
symbols and one covariant binary constructor. The two
constants correspond to two primitive types; the binary
covariant constructor corresponds to the pair constructor
for building products of types.

The construction in this section is an introduction to 
the more general construction in Section 5, where we give a 
quantifier elimination procedure for any number of constant 
symbols and relations between them. The construction in 
this section demonstrates the interaction between the term 
and boolean algebra components of the structural subtyping. 
We therefore believe the construction captures the essence 
of the general result of Section 5. 

The basic observation behind the quantifier elimination
procedure for two constant symbols is that the structure of
terms in this language is isomorphic to a disjoint union of
boolean algebras with some additional term structure connecting
elements from different boolean algebras. As we argue
below, the structural subtyping structure contains one
copy of a boolean algebra for every equivalence class of terms
that have the same "shape", i.e. are the same up to the constants
in the leaves.



Consider a signature $\Sigma = \{a, b, g\}$ where $a$ and $b$ are
constant symbols and $g$ is a function symbol of arity 2. We
define a partial order $\leq$ on the set $\mathrm{FT}(\Sigma)$ of ground terms
over $\Sigma$ as the least reflexive partial order relation $\rho$ satisfying

1. $a \mathrel{\rho} b$;

2. $(s_1 \mathrel{\rho} t_1) \land (s_2 \mathrel{\rho} t_2) \Rightarrow g(s_1, s_2) \mathrel{\rho} g(t_1, t_2)$.

The structure with equality in the language $\{a, b, g, \leq\}$,
where $\leq$ is interpreted as above and $a$, $b$, $g$ are interpreted
as free operations on the term algebra, corresponds to
structural subtyping with two base types $a$ and $b$ and one binary
type constructor $g$, with $g$ covariant in both arguments. We
denote this structure by BS. We proceed to show that BS
admits quantifier elimination and is therefore decidable.
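The order $\leq$ on ground terms can be made concrete with a short Python sketch. The representation is an assumption made for illustration: the strings `"a"` and `"b"` stand for the constants and a pair `(t1, t2)` stands for $g(t_1, t_2)$.

```python
def leq(s, t):
    """Subtype order on ground terms over {a, b, g}: a <= b, g covariant in both arguments."""
    s_is_pair, t_is_pair = isinstance(s, tuple), isinstance(t, tuple)
    if s_is_pair and t_is_pair:
        return leq(s[0], t[0]) and leq(s[1], t[1])
    if s_is_pair or t_is_pair:
        return False  # a constant is never related to a g-term
    return s == t or (s == "a" and t == "b")

print(leq(("a", "b"), ("b", "b")))
print(leq(("b", "a"), ("a", "b")))
```

Two terms are comparable only when they have the same tree structure, which is the observation that the next subsection turns into the notion of shape.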

4.1 Boolean Algebras on Equivalent Terms 

In preparation for the quantifier elimination procedure we 
define certain operations and relations on terms. We also 
establish some fundamental properties of the structure BS. 

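Define a new signature $\Sigma_0 = \{c^s, g^s\}$ as an abstraction of
signature $\Sigma = \{a, b, g\}$. Define function shapified $: \Sigma \to \Sigma_0$
by

\[ \mathrm{shapified}(a) = c^s, \quad \mathrm{shapified}(b) = c^s, \quad \mathrm{shapified}(g) = g^s \]

Let $\mathrm{ar}(\mathrm{shapified}(f)) = \mathrm{ar}(f)$ for each $f \in \Sigma$; in this case $c^s$ is
a constant and $g^s$ is a binary function symbol. Let $\mathrm{FT}(\Sigma_0)$
be the set of ground terms over the signature $\Sigma_0$. Define the
shape of a term $t$, as the function $\mathrm{sh} : \mathrm{FT}(\Sigma) \to \mathrm{FT}(\Sigma_0)$, by
letting

\[ \mathrm{sh}(f(t_1, \ldots, t_k)) = \mathrm{shapified}(f)(\mathrm{sh}(t_1), \ldots, \mathrm{sh}(t_k)) \]

for $k = \mathrm{ar}(f)$. In this case we have

\[ \mathrm{sh}(a) = c^s, \quad \mathrm{sh}(b) = c^s, \quad \mathrm{sh}(g(t_1, t_2)) = g^s(\mathrm{sh}(t_1), \mathrm{sh}(t_2)) \]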

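Define $t_1 \sim t_2$ iff $\mathrm{sh}(t_1) = \mathrm{sh}(t_2)$. Then $\sim$ is the smallest
equivalence relation $\rho$ such that

1. $a \mathrel{\rho} b$;

2. $(s_1 \mathrel{\rho} t_1) \land (s_2 \mathrel{\rho} t_2) \Rightarrow g(s_1, s_2) \mathrel{\rho} g(t_1, t_2)$.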

For every term $t$ define the word $\mathrm{tCont}(t) \in \{0, 1\}^*$ by letting

\[ \mathrm{tCont}(a) = 0, \quad \mathrm{tCont}(b) = 1, \quad \mathrm{tCont}(g(t_1, t_2)) = \mathrm{tCont}(t_1) \cdot \mathrm{tCont}(t_2) \]

The set of all words $w \in \{0, 1\}^n$ is isomorphic to the boolean
algebra $B_n$ of all subsets of some finite set of cardinality
$n$, so we write $w_1 \cap w_2$, $w_1 \cup w_2$, $w^c$ for operations
corresponding to intersection, union, and set complement in the set of
words $w \in \{0, 1\}^n$. We write $w_1 \subseteq w_2$ for $w_1 \cap w_2 = w_1$.
Define function $\delta$ by

\[ \delta(t) = \langle \mathrm{sh}(t), \mathrm{tCont}(t) \rangle \]

For a term $t$ in any language containing constant symbols, let
$\mathrm{tLen}(t)$ denote the number of occurrences of constant symbols
in $t$. If $w$ is a sequence of elements of some set, let
$\mathrm{sLen}(w)$ denote the length of the sequence. Observe that
$\mathrm{sLen}(\mathrm{tCont}(t)) = \mathrm{tLen}(t)$ and $\mathrm{tLen}(\mathrm{sh}(t)) = \mathrm{tLen}(t)$. Moreover,
$t_1 \sim t_2$ implies $\mathrm{sLen}(\mathrm{tCont}(t_1)) = \mathrm{sLen}(\mathrm{tCont}(t_2))$. Define
the set $B$ by

\[ B = \{\langle s, w \rangle \mid s \in \mathrm{FT}(\Sigma_0),\ w \in \{0, 1\}^*,\ \mathrm{tLen}(s) = \mathrm{sLen}(w)\} \]

Function $\delta$ is a bijection from the set $\mathrm{FT}(\Sigma)$ to the set $B$.
For $b_1, b_2 \in B$ define $b_1 \leq b_2$ iff $\delta^{-1}(b_1) \leq \delta^{-1}(b_2)$. From
the definitions it follows that

\[ \langle s_1, w_1 \rangle \leq \langle s_2, w_2 \rangle \iff s_1 = s_2 \land w_1 \subseteq w_2 \]

If $g$ is defined on $B$ via the isomorphism $\delta$ we also have

\[ g(\langle s_1, w_1 \rangle, \langle s_2, w_2 \rangle) = \langle g^s(s_1, s_2), w_1 \cdot w_2 \rangle \]
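The characterization of $\leq$ through $\delta$ can be checked exhaustively on small terms. The Python sketch below is only a sanity check under an assumed encoding (constants as the strings `"a"` and `"b"`, $g(t_1, t_2)$ as a pair, the shape constant as `"c"`); the helper names are introduced here for illustration.

```python
from itertools import product

def terms(depth):
    """All ground terms over {a, b, g} of nesting depth at most `depth`."""
    ts = ["a", "b"]
    for _ in range(depth):
        ts = ["a", "b"] + [(s, t) for s, t in product(ts, repeat=2)]
    return ts

def leq(s, t):
    if isinstance(s, tuple) and isinstance(t, tuple):
        return leq(s[0], t[0]) and leq(s[1], t[1])
    if isinstance(s, tuple) or isinstance(t, tuple):
        return False
    return s == t or (s == "a" and t == "b")

def sh(t):
    """Shape of a term: constants become "c", g-terms keep their structure."""
    return "c" if isinstance(t, str) else (sh(t[0]), sh(t[1]))

def t_cont(t):
    """Word of leaves read left to right: a -> 0, b -> 1."""
    if isinstance(t, str):
        return "0" if t == "a" else "1"
    return t_cont(t[0]) + t_cont(t[1])

def delta(t):
    return (sh(t), t_cont(t))

def word_subset(w1, w2):
    return len(w1) == len(w2) and all(x <= y for x, y in zip(w1, w2))

universe = terms(2)
injective = len({delta(t) for t in universe}) == len(universe)
order_matches = all(
    leq(s, t) == (sh(s) == sh(t) and word_subset(t_cont(s), t_cont(t)))
    for s, t in product(universe, repeat=2)
)
print(injective, order_matches)
```

The check covers only terms of depth at most 2, so it illustrates the statement rather than proving it.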

For any fixed $s_0 \in \mathrm{FT}(\Sigma_0)$, the set

\[ B(s_0) = \{\langle s, w \rangle \in B \mid s = s_0\} \tag{20} \]

is isomorphic to the boolean algebra $B_n$, where $n = \mathrm{tLen}(s_0)$.
Accordingly, we introduce on each $B(s)$ the set operations
$t_1 \cap_s t_2$, $t_1 \cup_s t_2$, $t_1^{c_s}$. Expressions $t_1 \cap_s t_2$ and $t_1 \cup_s t_2$ are
defined iff $\mathrm{sh}(t_1) = s$ and $\mathrm{sh}(t_2) = s$, whereas expression $t_1^{c_s}$
is defined iff $\mathrm{sh}(t_1) = s$.

We also introduce cardinality expressions as in Section 3.2.
If $t$ denotes a term, then the expression $|t|_s$ denotes
the number of elements of the set corresponding to $t$.
Here we require $s = \mathrm{sh}(t)$. We use expressions $|t|_s = k$ and
$|t|_s \geq k$ as atomic formulas for constant integer $k \geq 0$. Note
that

\[ t_1 \leq t_2 \iff \mathrm{sh}(t_1) = \mathrm{sh}(t_2) \land |t_1 \cap t_2^c|_{\mathrm{sh}(t_1)} = 0 \tag{21} \]

\[ t_1 = t_2 \iff \mathrm{sh}(t_1) = \mathrm{sh}(t_2) \land |(t_1 \cap t_2^c) \cup (t_1^c \cap t_2)|_{\mathrm{sh}(t_1)} = 0 \tag{22} \]

Let $\mathrm{sh}(t_1) = s_1$, $\mathrm{sh}(t_2) = s_2$, and $s = g^s(s_1, s_2)$. Then

\[ |g(t_1, t_2)|_s = |t_1|_{s_1} + |t_2|_{s_2} \tag{23} \]

Equation (23) allows decomposing formulas of form
$|g(t_1, t_2)|_s \geq k$ into propositional combinations of formulas
of form $|t_1|_{s_1} \geq k$ and $|t_2|_{s_2} \geq k$.
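The decomposition enabled by (23) can be sketched in Python. This illustration assumes that $|t|$ counts the leaves of $t$ equal to $b$ (the 1-positions of $\mathrm{tCont}(t)$); the specific disjunction used in `card_ge_decomposed` is one possible propositional combination and is an assumption of the sketch, since the paper does not spell it out here.

```python
def t_cont(t):
    """Word of leaves: a -> 0, b -> 1; a pair (t1, t2) stands for g(t1, t2)."""
    if isinstance(t, str):
        return "0" if t == "a" else "1"
    return t_cont(t[0]) + t_cont(t[1])

def card(t):
    """|t|: size of the set corresponding to t, i.e. the number of b leaves."""
    return t_cont(t).count("1")

def card_ge_decomposed(t1, t2, k):
    """|g(t1, t2)| >= k expressed using only cardinality constraints on t1 and t2."""
    return any(card(t1) >= i and card(t2) >= k - i for i in range(k + 1))

samples = ["a", "b", ("a", "b"), ("b", "b"), (("b", "a"), "b")]
additive = all(
    card((t1, t2)) == card(t1) + card(t2) for t1 in samples for t2 in samples
)
decomposition_ok = all(
    (card((t1, t2)) >= k) == card_ge_decomposed(t1, t2, k)
    for t1 in samples for t2 in samples for k in range(6)
)
print(additive, decomposition_ok)
```

The disjunction ranges over the ways of splitting the bound $k$ between the two components, which is why only finitely many constraints on $t_1$ and $t_2$ are needed.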

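Note further that the following equations hold:

\[ g(t_1, t_2) \cap g(t'_1, t'_2) = g(t_1 \cap t'_1, t_2 \cap t'_2) \]
\[ g(t_1, t_2) \cup g(t'_1, t'_2) = g(t_1 \cup t'_1, t_2 \cup t'_2) \]
\[ g(t_1, t_2)^c = g(t_1^c, t_2^c) \]

If $E(x_1, \ldots, x_n)$ denotes an expression consisting only of
operations of boolean algebra, then from (4.1) by induction
it follows that

\[ E(g(t_1^1, t_1^2), \ldots, g(t_n^1, t_n^2)) = g(E(t_1^1, \ldots, t_n^1), E(t_1^2, \ldots, t_n^2)) \tag{24} \]

Equations (24) and (23) imply

\[ |E(g(t_1^1, t_1^2), \ldots, g(t_n^1, t_n^2))| = |E(t_1^1, \ldots, t_n^1)| + |E(t_1^2, \ldots, t_n^2)| \tag{25} \]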
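The way boolean operations commute with the constructor $g$ can be tested on concrete terms. In the Python sketch below the operations are defined on leaf words and mapped back to terms by an inverse of $\delta$; all helper names and the term encoding are assumptions of this illustration.

```python
def sh(t):
    return "c" if isinstance(t, str) else (sh(t[0]), sh(t[1]))

def t_cont(t):
    if isinstance(t, str):
        return "0" if t == "a" else "1"
    return t_cont(t[0]) + t_cont(t[1])

def from_word(shape, w):
    """Inverse of delta: rebuild the term with the given shape and leaf word."""
    def build(s, i):
        if s == "c":
            return ("b" if w[i] == "1" else "a"), i + 1
        left, i = build(s[0], i)
        right, i = build(s[1], i)
        return (left, right), i
    term, used = build(shape, 0)
    assert used == len(w)
    return term

def meet(t1, t2):
    assert sh(t1) == sh(t2), "intersection is defined only for equal shapes"
    w = "".join("1" if x == y == "1" else "0" for x, y in zip(t_cont(t1), t_cont(t2)))
    return from_word(sh(t1), w)

def compl(t):
    return from_word(sh(t), "".join("0" if x == "1" else "1" for x in t_cont(t)))

def card(t):
    return t_cont(t).count("1")

t1, t1p = ("a", "b"), ("b", "b")
t2, t2p = ("b", ("a", "b")), ("b", ("b", "a"))
meet_distributes = meet((t1, t2), (t1p, t2p)) == (meet(t1, t1p), meet(t2, t2p))
compl_distributes = compl((t1, t2)) == (compl(t1), compl(t2))
# Instance of (25) with E(x, y) = x ∩ y^c
lhs = card(meet((t1, t2), compl((t1p, t2p))))
rhs = card(meet(t1, compl(t1p))) + card(meet(t2, compl(t2p)))
print(meet_distributes, compl_distributes, lhs == rhs)
```

Because the leaf word of $g(t_1, t_2)$ is the concatenation of the two component words, every bitwise operation splits along the same boundary.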

Boolean algebra $B(g^s(s_1, s_2))$ is isomorphic to the product
of boolean algebras $B(s_1)$ and $B(s_2)$; the constructor $g$ acts
as union of disjoint sets.

Figure 5: Operations and relations in structure FT2 (for example, $|\_|_{\_} = k$ : shape $\times$ term $\to$ bool)

4.2 A Multisorted Logic

To show the decidability of structure BS, we give a quantifier 
elimination procedure for an extended structure, denoted 
FT2. We use a first-order two-sorted logic with sorts term 
and shape interpreted over FT2. 

The domain of structure FT2 is $\mathrm{FT}(\Sigma) \cup \mathrm{FT}(\Sigma_0)$, with
elements of $\mathrm{FT}(\Sigma)$ having sort term and elements of $\mathrm{FT}(\Sigma_0)$ having
sort shape. Variables in Var have term sort, variables in $\mathrm{Var}^s$
have shape sort. In general, if $t$ denotes an element of FT2,
we write $t^s$ to indicate that the element has sort shape.

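Figure 5 shows operations and relations in FT2 with their
sort declarations. The signature is infinite because the
operations $|t|_s \geq k$ and $|t|_s = k$ are parameterized by a
nonnegative integer $k$.

We require all terms to be well-sorted. Functions $g_1$ and
$g_2$ are interpreted as partial selector functions in the term
constructor-selector language, so $D_{g_1} = D_{g_2} = \langle \langle x \rangle, \mathrm{Is}_g(x) \rangle$.
Similarly, $g^s_1$ and $g^s_2$ are partial selector functions in the
shape constructor-selector language, so
$D_{g^s_1} = D_{g^s_2} = \langle \langle x^s \rangle, \mathrm{Is}_{g^s}(x^s) \rangle$.
The expressions $t_1 \cap_s t_2$ and $t_1 \cup_s t_2$ are defined
iff $\mathrm{sh}(t_1) = \mathrm{sh}(t_2) = s$, and $t^{c_s}$ is defined iff $\mathrm{sh}(t) = s$. We
therefore let

\[ D_{\cap} = D_{\cup} = \langle \langle y^s, x_1, x_2 \rangle,\ \mathrm{sh}(x_1) = y^s \land \mathrm{sh}(x_2) = y^s \rangle \]

and

\[ D_{\_^c} = \langle \langle y^s, x \rangle,\ \mathrm{sh}(x) = y^s \rangle \]

For atomic formulas $|t|_s \geq k$ and $|t|_s = k$ we require the atomic
formula $\mathrm{sh}(t) = s$ to ensure well-definedness:

\[ D_{|\_|_{\_} = k} = D_{|\_|_{\_} \geq k} = \langle \langle y^s, x \rangle,\ \mathrm{sh}(x) = y^s \rangle \]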

Note that the language of Figure 5 subsumes the language
$\{a, b, g, \leq\}$ for the structural subtyping structure. The
quantifier-elimination procedure we present in Section 4.3
is therefore sufficient for quantifier elimination in the
first-order logic interpreted over the structural subtyping
structure FT2.

4.3 Quantifier Elimination for Two Constants 

We are now ready to present a quantifier elimination
procedure for the structure FT2. The quantifier elimination
procedure is based on the quantifier elimination for term
algebras of Section 3.4 as well as the quantifier elimination for
boolean algebras of Section 3.2.

We first define an auxiliary notion of a $u^s$-term as a term
formed starting from term variables and the constants
$0_{u^s}$ and $1_{u^s}$, using operations $\cap_{u^s}$, $\cup_{u^s}$, and $\_^{c_{u^s}}$.

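Definition 40 ($u^s$-terms) Let $u^s \in \mathrm{Var}^s$ be a shape variable.
The set of $u^s$-terms $\mathrm{Term}(u^s)$ is the least set such that:

1. $\mathrm{Var} \subseteq \mathrm{Term}(u^s)$

2. $0_{u^s}, 1_{u^s} \in \mathrm{Term}(u^s)$

3. if $t, t' \in \mathrm{Term}(u^s)$, then also

$t \cap_{u^s} t' \in \mathrm{Term}(u^s)$,

$t \cup_{u^s} t' \in \mathrm{Term}(u^s)$, and

$t^{c_{u^s}} \in \mathrm{Term}(u^s)$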

Similarly to base formulas of Section 3.4, we define structural
base formulas for the FT2 structure. A structural base formula
contains a copy of a base formula for the shape sort
(shapeBase), a base formula for the term sort without term
disequalities (termBase), a formula expressing the mapping of
term variables to shape variables (hom), and cardinality
constraints on term parameter nodes of the term base formula
(cardin).

Definition 41 (Structural Base Formula)
A structural base formula with:

• free term variables $x_1, \ldots, x_m$;

• internal non-parameter term variables $u_1, \ldots, u_p$;

• internal parameter term variables $u_{p+1}, \ldots, u_{p+q}$;

• free shape variables $x^s_1, \ldots, x^s_{m^s}$;

• internal non-parameter shape variables $u^s_1, \ldots, u^s_{p^s}$;

• internal parameter shape variables $u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}$

is a formula of form:

\[
\begin{array}{l}
\exists u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}. \\
\quad \mathrm{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) \ \land \\
\quad \mathrm{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) \ \land \\
\quad \mathrm{hom}(u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}) \ \land \\
\quad \mathrm{cardin}(u_{p+1}, \ldots, u_n, u^s_{p^s+1}, \ldots, u^s_{n^s})
\end{array}
\]

where $n = p + q$, $n^s = p^s + q^s$, and formulas shapeBase,
termBase, hom, and cardin are defined as follows.

\[
\mathrm{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) =
\bigwedge_{i=1}^{p^s} u^s_i = t_i(u^s_1, \ldots, u^s_{n^s}) \ \land\
\bigwedge_{i=1}^{m^s} x^s_i = u^s_{j_i} \ \land\
\mathrm{distinct}(u^s_1, \ldots, u^s_{n^s})
\]

where each $t_i$ is a shape term of form $f(u^s_{i_1}, \ldots, u^s_{i_k})$ for
some $f \in \Sigma_0$, $k = \mathrm{ar}(f)$, and $j : \{1, \ldots, m^s\} \to \{1, \ldots, n^s\}$ is
a function mapping indices of free shape variables to indices
of internal shape variables.

\[
\mathrm{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) =
\bigwedge_{i=1}^{p} u_i = t_i(u_1, \ldots, u_n) \ \land\
\bigwedge_{i=1}^{m} x_i = u_{j_i}
\]

where each $t_i$ is a term of form $f(u_{i_1}, \ldots, u_{i_k})$ for some
$f \in \Sigma$, $k = \mathrm{ar}(f)$, and $j : \{1, \ldots, m\} \to \{1, \ldots, n\}$ is a
function mapping indices of free term variables to indices of
internal term variables.

\[
\mathrm{hom}(u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}) = \bigwedge_{i=1}^{n} \mathrm{sh}(u_i) = u^s_{j_i}
\]

where $j : \{1, \ldots, n\} \to \{1, \ldots, n^s\}$ is some function such
that $\{j_1, \ldots, j_p\} \subseteq \{1, \ldots, p^s\}$ and
$\{j_{p+1}, \ldots, j_{p+q}\} \subseteq \{p^s + 1, \ldots, p^s + q^s\}$
(a term variable is a parameter variable iff its
shape is a parameter shape variable).

\[
\mathrm{cardin}(u_{p+1}, \ldots, u_{p+q}, u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}) = \psi_1 \land \cdots \land \psi_r
\]

where each $\psi_i$ is of form

\[ |t(u_{p+1}, \ldots, u_{p+q})|_{u^s} = k \]

or

\[ |t(u_{p+1}, \ldots, u_{p+q})|_{u^s} \geq k \]

for some $u^s$-term $t(u_{p+1}, \ldots, u_{p+q})$ that contains no variables
other than some of the variables $u_{p+1}, \ldots, u_{p+q}$, and
the following condition holds:

If a variable $u_{p+j}$ occurs in the term
$t(u_{p+1}, \ldots, u_{p+q})$, then $\mathrm{sh}(u_{p+j}) = u^s$
occurs in formula hom. (26)

We require each structural base formula to satisfy the
following conditions:

P0) the graph associated with the shape base formula

\[ \exists u^s_1, \ldots, u^s_{n^s}.\ \mathrm{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) \]

is acyclic (compare to Definition 21);

P1) congruence closure property for the shapeBase subformula:
there are no two distinct variables $u^s_i$ and $u^s_j$ such that
both $u^s_i = f(u^s_{l_1}, \ldots, u^s_{l_k})$ and $u^s_j = f(u^s_{l_1}, \ldots, u^s_{l_k})$ occur
as conjuncts in formula shapeBase;

P2) congruence closure property for the termBase subformula:
there are no two distinct variables $u_i$ and $u_j$ such that
both $u_i = f(u_{l_1}, \ldots, u_{l_k})$ and $u_j = f(u_{l_1}, \ldots, u_{l_k})$ occur
as conjuncts in formula termBase;

P3) homomorphism property of sh: for every non-parameter
term variable $u$ such that $u = f(u_{i_1}, \ldots, u_{i_k})$ occurs
in termBase, if the conjunct $\mathrm{sh}(u) = u^s$ occurs in
hom, then for some shape variables $u^s_{j_1}, \ldots, u^s_{j_k}$ the
term $u^s = f^s(u^s_{j_1}, \ldots, u^s_{j_k})$ occurs in shapeBase, where
$f^s = \mathrm{shapified}(f)$, and for every $r$ where $1 \leq r \leq k$,
the conjunct $\mathrm{sh}(u_{i_r}) = u^s_{j_r}$ occurs in hom.

Figure 6: One of the Base Formulas Resulting from (28)

According to Definition 41 a structural base formula contains
no selector function symbols. A formulation using selector
symbols is also possible, as in Definition 20. The
only partial function symbols occurring in a structural base
formula of Definition 41 are in the cardin subformula.
Condition (26) therefore ensures that functions in cardin and thus
the entire base formula are well-defined.

Note that acyclicity of the shape base formula shapeBase
(condition P0) implies acyclicity of the term base formula as
well. Namely, condition P3 ensures that any cycle in
termBase implies a cycle in shapeBase.

As in Section 3.4 we proceed to show that each quantifier- 
free formula can be written as a disjunction of base formulas 
and each base formula can be written as a quantifier-free 
formula. 

We strongly encourage the reader to study the following 
example because it illustrates the idea behind our quantifier- 
elimination decision procedure. 

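Example 42 The following sentence is true in structure
FT2.

\[
\begin{array}{l}
\forall x, y.\ x \leq y \Rightarrow \\
\quad \exists z.\ z \leq x \land z \leq y \ \land \\
\qquad \forall w.\ w \leq x \land w \leq y \Rightarrow \\
\qquad\quad \forall v.\ g(v, z) \leq g(z, v) \land \mathrm{Is}_g(v) \land \mathrm{Is}_g(w) \Rightarrow g_1(w) \leq g_1(v)
\end{array}
\tag{27}
\]

An informal proof of sentence (27) is as follows. Suppose
that $x \leq y$. Then $\mathrm{sh}(x) = \mathrm{sh}(y) = x^s$. Let $z = x \cap_{x^s} y$.
Now consider some $w$ such that $w \leq x$ and $w \leq y$. Then
$\mathrm{sh}(w) = x^s$, so $w \leq z$. Suppose that $v$ is such that
$g(v, z) \leq g(z, v)$. Then by covariance of $g$ we have $z \leq v$, so $w \leq v$. If
we assume $\mathrm{Is}_g(w)$ and $\mathrm{Is}_g(v)$, then $g_1(w)$ and $g_1(v)$ are well
defined and by covariance of $g$ we conclude $g_1(w) \leq g_1(v)$,
as desired.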
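Sentence (27) can also be sanity-checked by brute force over a bounded universe of terms. The Python sketch below is an illustration only, not the decision procedure: it restricts all quantifiers to terms of depth at most 2, uses the witness $z = x \cap y$ from the informal proof, and relies on an assumed encoding of terms (strings `"a"`, `"b"` and pairs for $g$).

```python
from itertools import product

def terms(depth):
    ts = ["a", "b"]
    for _ in range(depth):
        ts = ["a", "b"] + [(s, t) for s, t in product(ts, repeat=2)]
    return ts

def leq(s, t):
    if isinstance(s, tuple) and isinstance(t, tuple):
        return leq(s[0], t[0]) and leq(s[1], t[1])
    if isinstance(s, tuple) or isinstance(t, tuple):
        return False
    return s == t or (s == "a" and t == "b")

def meet(s, t):
    """Greatest lower bound of two terms of the same shape."""
    if isinstance(s, str):
        return "b" if s == t == "b" else "a"
    return (meet(s[0], t[0]), meet(s[1], t[1]))

def holds_27(universe):
    for x, y in product(universe, repeat=2):
        if not leq(x, y):
            continue
        z = meet(x, y)  # witness for the existential quantifier
        if not (leq(z, x) and leq(z, y)):
            return False
        for w in universe:
            if not (leq(w, x) and leq(w, y)):
                continue
            for v in universe:
                # Is_g(v) and Is_g(w) hold exactly when v and w are pairs.
                premise = (leq((v, z), (z, v))
                           and isinstance(v, tuple) and isinstance(w, tuple))
                if premise and not leq(w[0], v[0]):
                    return False
    return True

result = holds_27(terms(2))
print(result)
```

A bounded check of this kind cannot establish the sentence for all terms; it only confirms that no small counterexample exists.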

We now give an alternative argument that shows that
sentence (27) is true. This alternative argument illustrates
the idea behind our quantifier-elimination decision procedure.
For the sake of brevity we perform some additional
simplifications along the way that are not part of the
procedure we present (although they could be incorporated to
improve efficiency), and we skip consideration of some
uninteresting cases during the case analyses.

Let us first eliminate the quantifier from the formula

\[ \forall v.\ g(v, z) \leq g(z, v) \land \mathrm{Is}_g(v) \land \mathrm{Is}_g(w) \Rightarrow g_1(w) \leq g_1(v) \tag{28} \]

Formula (28) is equivalent to $\neg\exists v.\ \phi_1$ where

\[ \phi_1 \equiv g(v, z) \leq g(z, v) \land \mathrm{Is}_g(v) \land \mathrm{Is}_g(w) \land \neg(g_1(w) \leq g_1(v)) \tag{29} \]



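We next use (21) to eliminate atomic formulas $t_1 \leq t_2$ and
replace them with cardinality constraints, resulting in
formula $\phi_2$ equivalent to $\phi_1$:

\[ \phi_2 \equiv \phi_{2,1} \land \phi_{2,2} \]

where

\[
\begin{array}{l}
\phi_{2,1} \equiv |g(v, z) \cap g(z, v)^c|_{\mathrm{sh}(g(v,z))} = 0 \ \land \\
\qquad \mathrm{sh}(g(v, z)) = \mathrm{sh}(g(z, v)) \ \land \\
\qquad \mathrm{Is}_g(v) \land \mathrm{Is}_g(w)
\end{array}
\tag{30}
\]

and

\[
\phi_{2,2} \equiv \neg\left(|g_1(w) \cap g_1(v)^c|_{\mathrm{sh}(g_1(w))} = 0 \land \mathrm{sh}(g_1(w)) = \mathrm{sh}(g_1(v))\right)
\tag{31}
\]

Here we have written e.g.

\[ |g(v, z) \cap g(z, v)^c|_{\mathrm{sh}(g(v,z))} = 0 \]

as a shorthand for

\[ |g(v, z) \cap_{\mathrm{sh}(g(v,z))} g(z, v)^{c_{\mathrm{sh}(g(v,z))}}|_{\mathrm{sh}(g(v,z))} = 0 \]

(In general, we omit term shape arguments for boolean algebra
operations if the arguments are identical to the enclosing
term shape argument of the cardinality constraint.)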

We next transform $\phi_2$ into a disjunction of well-defined
conjunctions. Following the ideas in Proposition 8, we
transform $\phi_{2,2}$ into $\phi_{3,1} \lor \phi_{3,2}$ where

\[
\phi_{3,1} \equiv |g_1(w) \cap g_1(v)^c|_{\mathrm{sh}(g_1(w))} \geq 1 \land \mathrm{sh}(g_1(w)) = \mathrm{sh}(g_1(v))
\tag{32}
\]

and

\[ \phi_{3,2} \equiv \mathrm{sh}(g_1(w)) \neq \mathrm{sh}(g_1(v)) \]

and then transform $\phi_{2,1} \land \phi_{2,2}$ into

\[ (\phi_{2,1} \land \phi_{3,1}) \lor (\phi_{2,1} \land \phi_{3,2}) \]

For the sake of brevity we ignore the case $\phi_{2,1} \land \phi_{3,2}$; it is
possible to show that $\phi_{2,1} \land \phi_{3,2}$ is equivalent to false in the
context of the entire formula.

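We transform $\phi_{2,1} \land \phi_{3,1}$ into unnested form, introducing
fresh existentially quantified variables $u_{vz}$, $u_{zv}$, $u_{w1}$, $u_{v1}$, $u^s_{vz}$,
$u^s_{w1}$ that denote terms occurring in $\phi_{2,1} \land \phi_{3,1}$. The result
is formula $\phi_4$ where

\[
\begin{array}{l}
\phi_4 \equiv \exists u_{vz}, u_{zv}, u_{w1}, u_{v1}, u^s_{vz}, u^s_{w1}. \\
\quad u_{vz} = g(v, z) \land u_{zv} = g(z, v) \ \land \\
\quad u_{w1} = g_1(w) \land u_{v1} = g_1(v) \ \land \\
\quad u^s_{vz} = \mathrm{sh}(u_{vz}) \land u^s_{w1} = \mathrm{sh}(u_{w1}) \ \land \\
\quad \mathrm{sh}(u_{zv}) = u^s_{vz} \land \mathrm{sh}(u_{v1}) = u^s_{w1} \ \land \\
\quad \mathrm{Is}_g(v) \land \mathrm{Is}_g(w) \ \land \\
\quad |u_{vz} \cap u_{zv}^c|_{u^s_{vz}} = 0 \land |u_{w1} \cap u_{v1}^c|_{u^s_{w1}} \geq 1
\end{array}
\tag{33}
\]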

To transform $\phi_4$ into a disjunction of structural base formulas
we keep introducing new existentially quantified variables
and adding derived conjuncts to satisfy the invariants of
Definition 41.

Because $\mathrm{Is}_g(v)$ and $\mathrm{Is}_g(w)$ appear in the conjunction, we
give names to the remaining successors of $v$, $w$, by introducing
$u_{w2} = g_2(w)$, $u_{v2} = g_2(v)$. We may now write
the constraints in constructor language, using e.g. the conjunct
$v = g(u_{v1}, u_{v2})$ instead of

\[ \mathrm{Is}_g(v) \land u_{v1} = g_1(v) \land u_{v2} = g_2(v) \]

To ensure that every term variable has an associated shape
variable, we introduce fresh variables $u^s_v$, $u^s_w$, $u^s_z$, $u^s_{w2}$, $u^s_{v2}$
with conjuncts $u^s_v = \mathrm{sh}(v)$, $u^s_w = \mathrm{sh}(w)$, $u^s_z = \mathrm{sh}(z)$,
$u^s_{w2} = \mathrm{sh}(u_{w2})$, $u^s_{v2} = \mathrm{sh}(u_{v2})$.

Note that a base formula contains a $\mathrm{distinct}(u^s_1, \ldots, u^s_{n^s})$
subformula. In the case when the current conjunction is not
strong enough to entail the disequality between shape variables
$u^s_i$ and $u^s_j$, we perform case analysis, considering the
case $u^s_i = u^s_j$ (then $u^s_i$ can be replaced by $u^s_j$), and the case
$u^s_i \neq u^s_j$. This case analysis will lead to a disjunction of
structural base formulas (unless some of the formulas are shown
contradictory in the transformation process). In contrast to
shape variables, we do not perform case analysis for
disequality of term variables, because termBase in Definition 41
does not contain a distinct subformula.

In this example we perform case analysis on whether
$u^s_w = u^s_v$ and $u^s_{w2} = u^s_{v2}$ should hold. For the sake of the
example let us consider the case when $u^s_w = u^s_v = u^s_z$, $u^s_{w2} = u^s_{v2}$,
and $u^s_{vz}$, $u^s_w$, $u^s_{w1}$, $u^s_{w2}$ are all distinct. In that case shape
variables $u^s_v$, $u^s_z$, $u^s_w$ denote the same shape, so let us replace
e.g. $u^s_v$ and $u^s_z$ with $u^s_w$. Similarly, we replace $u^s_{v2}$ with $u^s_{w2}$.
We obtain conjuncts $\mathrm{sh}(v) = u^s_w$, $\mathrm{sh}(z) = u^s_w$, $\mathrm{sh}(u_{v2}) = u^s_{w2}$.

We next ensure homomorphism property P3 in Definition 41.
From conjuncts $u_{vz} = g(v, z)$, $\mathrm{sh}(u_{vz}) = u^s_{vz}$,
$\mathrm{sh}(v) = u^s_w$, and $\mathrm{sh}(z) = u^s_w$, we conclude

\[
u^s_{vz} = \mathrm{sh}(u_{vz}) = \mathrm{sh}(g(v, z)) = g^s(\mathrm{sh}(v), \mathrm{sh}(z)) = g^s(u^s_w, u^s_w)
\]

so we add the conjunct $u^s_{vz} = g^s(u^s_w, u^s_w)$ to the formula.
Similarly, from $w = g(u_{w1}, u_{w2})$, $\mathrm{sh}(w) = u^s_w$, $\mathrm{sh}(u_{w1}) = u^s_{w1}$,
$\mathrm{sh}(u_{w2}) = u^s_{w2}$ we conclude $u^s_w = g^s(u^s_{w1}, u^s_{w2})$ and add
this conjunct to the formula. Adding these two conjuncts
makes property P3 hold. (Note that, had we decided to
consider the case where $\mathrm{sh}(v) \neq \mathrm{sh}(z)$, we would have arrived
at a contradiction due to $\mathrm{sh}(u_{vz}) = \mathrm{sh}(u_{zv})$.)

We next apply rule (25) to reduce all cardinality constraints
into cardinality constraints on parameter nodes
(nodes $u$ for which there is no conjunct of form
$u = f(u_{i_1}, \ldots, u_{i_k})$). We replace $|u_{vz} \cap u_{zv}^c|_{u^s_{vz}} = 0$ with

\[ |v \cap z^c|_{u^s_w} = 0 \land |z \cap v^c|_{u^s_w} = 0 \tag{34} \]

Variable $z$ is a parameter variable, but $v$ is not, which
prevents application of (25). We therefore introduce $u_{z1}$ and
$u_{z2}$ such that $z = g(u_{z1}, u_{z2})$. Because $\mathrm{sh}(z) = u^s_w$, we have
$\mathrm{sh}(u_{z1}) = u^s_{w1}$ and $\mathrm{sh}(u_{z2}) = u^s_{w2}$ by the homomorphism
property. We can now continue applying rule (25) to (34). The
result is:

\[
\begin{array}{l}
|u_{v1} \cap u_{z1}^c|_{u^s_{w1}} = 0 \land |u_{z1} \cap u_{v1}^c|_{u^s_{w1}} = 0 \ \land \\
|u_{v2} \cap u_{z2}^c|_{u^s_{w2}} = 0 \land |u_{z2} \cap u_{v2}^c|_{u^s_{w2}} = 0
\end{array}
\]

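To make the formula conform to Definition 41 we introduce
internal variables $u_v$, $u_z$, $u_w$ corresponding to free variables
$v$, $z$, $w$, respectively. The resulting structural base formula
is

\[
\begin{array}{l}
\exists u_{vz}, u_{zv}, u_v, u_z, u_w, u_{v1}, u_{v2}, u_{z1}, u_{z2}, u_{w1}, u_{w2},
u^s_{vz}, u^s_w, u^s_{w1}, u^s_{w2}. \\
\quad \mathrm{shapeBase}_1 \land \mathrm{termBase}_1 \land \mathrm{hom}_1 \land \mathrm{cardin}_1
\end{array}
\tag{35}
\]

where

\[
\begin{array}{l}
\mathrm{shapeBase}_1 \equiv u^s_{vz} = g^s(u^s_w, u^s_w) \land u^s_w = g^s(u^s_{w1}, u^s_{w2}) \ \land \\
\qquad \mathrm{distinct}(u^s_{vz}, u^s_w, u^s_{w1}, u^s_{w2}) \\[4pt]
\mathrm{termBase}_1 \equiv u_{vz} = g(u_v, u_z) \land u_{zv} = g(u_z, u_v) \ \land \\
\qquad u_v = g(u_{v1}, u_{v2}) \land u_z = g(u_{z1}, u_{z2}) \ \land \\
\qquad u_w = g(u_{w1}, u_{w2}) \ \land \\
\qquad v = u_v \land z = u_z \land w = u_w \\[4pt]
\mathrm{hom}_1 \equiv \mathrm{sh}(u_{vz}) = u^s_{vz} \land \mathrm{sh}(u_{zv}) = u^s_{vz} \ \land \\
\qquad \mathrm{sh}(u_v) = u^s_w \land \mathrm{sh}(u_z) = u^s_w \land \mathrm{sh}(u_w) = u^s_w \ \land \\
\qquad \mathrm{sh}(u_{v1}) = u^s_{w1} \land \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \ \land \\
\qquad \mathrm{sh}(u_{v2}) = u^s_{w2} \land \mathrm{sh}(u_{z2}) = u^s_{w2} \land \mathrm{sh}(u_{w2}) = u^s_{w2} \\[4pt]
\mathrm{cardin}_1 \equiv |u_{v1} \cap u_{z1}^c|_{u^s_{w1}} = 0 \land |u_{z1} \cap u_{v1}^c|_{u^s_{w1}} = 0 \ \land \\
\qquad |u_{v2} \cap u_{z2}^c|_{u^s_{w2}} = 0 \land |u_{z2} \cap u_{v2}^c|_{u^s_{w2}} = 0 \ \land \\
\qquad |u_{w1} \cap u_{v1}^c|_{u^s_{w1}} \geq 1
\end{array}
\]

Figure 6 shows a graph representation of the subformulas
$\mathrm{shapeBase}_1$, $\mathrm{termBase}_1$, and $\mathrm{hom}_1$ of the resulting structural
base formula.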

Recall that we are eliminating the quantification over $v$
from $\neg\exists v.\ \phi_1$. We can now existentially quantify over $v$. As
in Proposition 27, we simply remove the conjunct $v = u_v$
from termBase and the quantifier $\exists v$.

As in Figure 4 of Section 3.4 the structural base formula
form allows us to eliminate an existential quantifier,
whereas the quantifier-free form allows us to eliminate a
negation. We transform the structural base formula (35)
into a quantifier-free formula as follows.

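We first use rule (7) to eliminate variable $u_{vz}$, replacing
it with $g(u_v, u_z)$. In the resulting formula $g(u_v, u_z)$ occurs only
in $\mathrm{hom}_1$ in the form

\[ \mathrm{sh}(g(u_v, u_z)) = u^s_{vz} \tag{36} \]

But (36) is a consequence of the conjuncts $u^s_{vz} = g^s(u^s_w, u^s_w)$,
$\mathrm{sh}(u_v) = u^s_w$ and $\mathrm{sh}(u_z) = u^s_w$, so we omit (36) from the
formula. In an analogous way we eliminate variable $u_{zv}$ and
the conjuncts that contain it. We also eliminate $u_v$,
analogously to $u_{vz}$ and $u_{zv}$. In the resulting formula $u^s_{vz}$
occurs only in the distinct subformula of shapeBase. Conjuncts
$u^s_{vz} \neq u^s_w$, $u^s_{vz} \neq u^s_{w1}$, and $u^s_{vz} \neq u^s_{w2}$ follow from the
remaining conjuncts in shapeBase by acyclicity. Hence we may
replace $\mathrm{distinct}(u^s_{vz}, u^s_w, u^s_{w1}, u^s_{w2})$ by $\mathrm{distinct}(u^s_w, u^s_{w1}, u^s_{w2})$.
Now $u^s_{vz}$ does not occur in the matrix of the formula, so we
may eliminate $\exists u^s_{vz}$ altogether.

The resulting formula is:

\[
\begin{array}{l}
\phi_5 \equiv \exists u_z, u_w, u_{v1}, u_{v2}, u_{z1}, u_{z2}, u_{w1}, u_{w2}, u^s_w, u^s_{w1}, u^s_{w2}. \\
\quad u^s_w = g^s(u^s_{w1}, u^s_{w2}) \land \mathrm{distinct}(u^s_w, u^s_{w1}, u^s_{w2}) \ \land \\
\quad u_z = g(u_{z1}, u_{z2}) \land u_w = g(u_{w1}, u_{w2}) \ \land \\
\quad z = u_z \land w = u_w \ \land \\
\quad \mathrm{sh}(u_z) = u^s_w \land \mathrm{sh}(u_w) = u^s_w \ \land \\
\quad \mathrm{sh}(u_{v1}) = u^s_{w1} \land \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \ \land \\
\quad \mathrm{sh}(u_{v2}) = u^s_{w2} \land \mathrm{sh}(u_{z2}) = u^s_{w2} \land \mathrm{sh}(u_{w2}) = u^s_{w2} \ \land \\
\quad |u_{v1} \cap u_{z1}^c|_{u^s_{w1}} = 0 \land |u_{z1} \cap u_{v1}^c|_{u^s_{w1}} = 0 \ \land \\
\quad |u_{v2} \cap u_{z2}^c|_{u^s_{w2}} = 0 \land |u_{z2} \cap u_{v2}^c|_{u^s_{w2}} = 0 \ \land \\
\quad |u_{w1} \cap u_{v1}^c|_{u^s_{w1}} \geq 1
\end{array}
\tag{37}
\]

We next eliminate $u_{v1}$. It suffices to eliminate it from the
conjuncts where it occurs, so we consider formula $\phi_{5,1}$:

\[
\begin{array}{l}
\phi_{5,1} \equiv \exists u_{v1}. \\
\quad \mathrm{sh}(u_{v1}) = u^s_{w1} \land \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \ \land \\
\quad |u_{v1} \cap u_{z1}^c|_{u^s_{w1}} = 0 \land |u_{z1} \cap u_{v1}^c|_{u^s_{w1}} = 0 \ \land \\
\quad |u_{w1} \cap u_{v1}^c|_{u^s_{w1}} \geq 1
\end{array}
\tag{38}
\]

Note that all variables from $\phi_{5,1}$ belong to $B(s)$ where $s$
is the value of shape variable $u^s_{w1}$ (see (20)). This means
that we can apply quantifier elimination for boolean algebra
(Section 3.2) to eliminate $u_{v1}$. The result is

\[
\phi_{5,2} \equiv \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \land
|u_{w1} \cap u_{z1}^c|_{u^s_{w1}} \geq 1
\tag{39}
\]

Similarly, to eliminate $u_{v2}$ we consider formula $\phi_{5,3}$:

\[
\begin{array}{l}
\phi_{5,3} \equiv \exists u_{v2}. \\
\quad \mathrm{sh}(u_{v2}) = u^s_{w2} \land \mathrm{sh}(u_{z2}) = u^s_{w2} \land \mathrm{sh}(u_{w2}) = u^s_{w2} \ \land \\
\quad |u_{v2} \cap u_{z2}^c|_{u^s_{w2}} = 0 \land |u_{z2} \cap u_{v2}^c|_{u^s_{w2}} = 0
\end{array}
\tag{40}
\]

The result of boolean algebra quantifier elimination on $\phi_{5,3}$
is true (indeed, one may let $u_{v2} = u_{z2}$). The resulting base
formula with $u_{v1}$ and $u_{v2}$ eliminated is $\phi_6$:

\[
\begin{array}{l}
\phi_6 \equiv \exists u_z, u_w, u_{z1}, u_{z2}, u_{w1}, u_{w2}, u^s_w, u^s_{w1}, u^s_{w2}. \\
\quad u^s_w = g^s(u^s_{w1}, u^s_{w2}) \land \mathrm{distinct}(u^s_w, u^s_{w1}, u^s_{w2}) \ \land \\
\quad u_z = g(u_{z1}, u_{z2}) \land u_w = g(u_{w1}, u_{w2}) \ \land \\
\quad z = u_z \land w = u_w \ \land \\
\quad \mathrm{sh}(u_z) = u^s_w \land \mathrm{sh}(u_w) = u^s_w \ \land \\
\quad \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \ \land \\
\quad \mathrm{sh}(u_{z2}) = u^s_{w2} \land \mathrm{sh}(u_{w2}) = u^s_{w2} \ \land \\
\quad |u_{w1} \cap u_{z1}^c|_{u^s_{w1}} \geq 1
\end{array}
\]

Observe that the equalities in $\phi_6$ are sufficient to express all
variables bound in $\phi_6$ in terms of free variables (all internal
variables are "covered"):

\[
\begin{array}{ll}
u_z = z & u_w = w \\
u_{z1} = g_1(z) & u_{z2} = g_2(z) \\
u_{w1} = g_1(w) & u_{w2} = g_2(w) \\
u^s_w = \mathrm{sh}(w) & \\
u^s_{w1} = g^s_1(\mathrm{sh}(w)) & u^s_{w2} = g^s_2(\mathrm{sh}(w))
\end{array}
\]

Structural base formula $\phi_6$ is therefore equivalent to the
quantifier-free formula $\phi_{7,1}$: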

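$$
\begin{aligned}
\phi_{7,1} \equiv\ & \mathsf{Is}_{g^s}(\mathsf{sh}(w)) \land \mathsf{Is}_g(w) \land \mathsf{Is}_g(z) \land{}\\
& \mathsf{distinct}(g^s_1(\mathsf{sh}(w)), g^s_2(\mathsf{sh}(w))) \land{}\\
& \mathsf{sh}(z) = \mathsf{sh}(w) \land |g_1(w) \cap g_1(z)^c|_{g^s_1(\mathsf{sh}(w))} \geq 1
\end{aligned}
\tag{43}
$$

When transforming formula $\phi_4$ we chose the case $u^s_{w1} \neq u^s_{w2}$. If we choose the case $u^s_{w1} = u^s_{w2}$, we obtain quantifier-free formula $\phi_{7,2}$:

$$
\begin{aligned}
\phi_{7,2} \equiv\ & \mathsf{Is}_{g^s}(\mathsf{sh}(w)) \land \mathsf{Is}_g(w) \land \mathsf{Is}_g(z) \land{}\\
& \mathsf{sh}(z) = \mathsf{sh}(w) \land g^s_1(\mathsf{sh}(w)) = g^s_2(\mathsf{sh}(w)) \land{}\\
& |g_1(w) \cap g_1(z)^c|_{g^s_1(\mathsf{sh}(w))} \geq 1
\end{aligned}
\tag{44}
$$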

Our quantifier elimination would also consider the case $\mathsf{sh}(g_2(w)) \neq \mathsf{sh}(g_2(z))$. The procedure finds the case contradictory in a larger context, when eliminating $\exists z$, because $\mathsf{sh}(z) = \mathsf{sh}(x) = \mathsf{sh}(w)$ follows from $z \leq x$ and $w \leq x$. Ignoring this case, we observe that $\phi_{7,1} \lor \phi_{7,2}$ is equivalent to the quantifier-free formula $\phi_8$, where

$$
\begin{aligned}
\phi_8 \equiv\ & \mathsf{Is}_{g^s}(\mathsf{sh}(w)) \land \mathsf{Is}_g(w) \land \mathsf{Is}_g(z) \land{}\\
& \mathsf{sh}(z) = \mathsf{sh}(w) \land |g_1(w) \cap g_1(z)^c|_{g^s_1(\mathsf{sh}(w))} \geq 1
\end{aligned}
\tag{45}
$$

Let us therefore assume that the result of quantifier elimination in (28) is $\lnot\phi_8$.

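We proceed to eliminate the next quantifier, $\forall w$, from

$$
\forall w.\ w \leq x \land w \leq y \Rightarrow \lnot\phi_8
\tag{46}
$$

(46) is equivalent to

$$
\lnot\exists w.\ w \leq x \land w \leq y \land \phi_8
$$

After eliminating $\leq$ we obtain

$$
\begin{aligned}
\lnot\exists w.\ & |w \cap x^c|_{\mathsf{sh}(w)} = 0 \land \mathsf{sh}(x) = \mathsf{sh}(w) \land{}\\
& |w \cap y^c|_{\mathsf{sh}(w)} = 0 \land \mathsf{sh}(y) = \mathsf{sh}(w) \land{}\\
& \mathsf{Is}_{g^s}(\mathsf{sh}(w)) \land \mathsf{Is}_g(w) \land \mathsf{Is}_g(z) \land{}\\
& \mathsf{sh}(z) = \mathsf{sh}(w) \land |g_1(w) \cap g_1(z)^c|_{g^s_1(\mathsf{sh}(w))} \geq 1
\end{aligned}
\tag{47}
$$

We now proceed similarly as in eliminating variable $v$. The result is $\lnot\phi_9$ where

$$
\begin{aligned}
\phi_9 \equiv\ & \mathsf{sh}(x) = \mathsf{sh}(z) \land \mathsf{sh}(y) = \mathsf{sh}(z) \land{}\\
& \mathsf{Is}_g(x) \land \mathsf{Is}_g(y) \land \mathsf{Is}_g(z) \land \mathsf{Is}_{g^s}(\mathsf{sh}(z)) \land{}\\
& |g_1(x) \cap g_1(y) \cap g_1(z)^c|_{g^s_1(\mathsf{sh}(z))} \geq 1
\end{aligned}
\tag{48}
$$

The remaining quantifiers that bind $z$, $y$, and $x$ are eliminated similarly.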
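To eliminate the quantifier $\exists z$, we need to transform $\lnot\phi_9$ into a disjunction of base formulas. This transformation requires negation of $\phi_9$ and creates several disjuncts. We consider only the two cases, $\phi_{10}$ and $\phi_{11}$, that are not contradictory in the enclosing context of conjuncts $z \leq x$ and $z \leq y$:

$$
\phi_{10} \equiv \mathsf{sh}(x) = \mathsf{sh}(z) \land \mathsf{sh}(y) = \mathsf{sh}(z) \land \lnot\mathsf{Is}_{g^s}(\mathsf{sh}(z))
\tag{49}
$$

$$
\begin{aligned}
\phi_{11} \equiv\ & \mathsf{sh}(x) = \mathsf{sh}(z) \land \mathsf{sh}(y) = \mathsf{sh}(z) \land{}\\
& \mathsf{Is}_g(x) \land \mathsf{Is}_g(y) \land \mathsf{Is}_g(z) \land \mathsf{Is}_{g^s}(\mathsf{sh}(z)) \land{}\\
& |g_1(x) \cap g_1(y) \cap g_1(z)^c|_{g^s_1(\mathsf{sh}(z))} = 0
\end{aligned}
\tag{50}
$$

$\phi_{10}$ is equivalent to

$$
\mathsf{sh}(x) = c^s \land \mathsf{sh}(y) = c^s \land \mathsf{sh}(z) = c^s
\tag{51}
$$

The result of eliminating $\exists z$ from

$$
\exists z.\ |z \cap x^c|_{\mathsf{sh}(z)} = 0 \land |z \cap y^c|_{\mathsf{sh}(z)} = 0 \land \phi_{10}
$$

is therefore

$$
\phi_{10,2} \equiv \mathsf{sh}(x) = \mathsf{sh}(y) \land \lnot\mathsf{Is}_{g^s}(\mathsf{sh}(x))
\tag{52}
$$

The result of eliminating $\exists z$ from

$$
\exists z.\ |z \cap x^c|_{\mathsf{sh}(z)} = 0 \land |z \cap y^c|_{\mathsf{sh}(z)} = 0 \land \phi_{11}
$$

is

$$
\phi_{11,2} \equiv \mathsf{sh}(x) = \mathsf{sh}(y) \land \mathsf{Is}_{g^s}(\mathsf{sh}(x))
$$

$\phi_{10,2} \lor \phi_{11,2}$ is equivalent to $\mathsf{sh}(x) = \mathsf{sh}(y)$. Converting

$$
|x \cap y^c|_{\mathsf{sh}(x)} = 0 \land \mathsf{sh}(x) = \mathsf{sh}(y) \Rightarrow \mathsf{sh}(x) = \mathsf{sh}(y)
$$

to structural base formula yields true. We conclude that (27) is a true sentence in the structure $FT_2$, which completes our quantifier elimination procedure example.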

♦ 

Formulas in Example 42 do not contain disequalities between term variables, only disequalities between shape variables. If a conjunction contains disequalities between term variables, we eliminate the disequalities using rule (22) in the process of converting the formula to a disjunction of structural base formulas. The following Example 43 illustrates this process.

Example 43 Consider the formula

$$
\phi_6' \equiv \phi_6 \land u_z \neq u_w
$$

where $\phi_6$ is given by (41). By (22), literal $u_z \neq u_w$ is equivalent to $\psi_1 \lor \psi_2$ where

$$
\psi_1 \equiv \mathsf{sh}(u_z) \neq \mathsf{sh}(u_w)
\tag{53}
$$

and

$$
\psi_2 \equiv \mathsf{sh}(u_z) = \mathsf{sh}(u_w) \land |(u_z \cap u_w^c) \cup (u_z^c \cap u_w)|_{\mathsf{sh}(u_z)} \geq 1
\tag{54}
$$



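unnested form              | cardinality constraint
---------------------------+---------------------------------
$x = x_1 \cap_s x_2$       | $|x + (x_1 \cap x_2)|_s = 0$
$x = x_1 \cup_s x_2$       | $|x + (x_1 \cup x_2)|_s = 0$
$x = x_1^{c_s}$            | $|x + x_1^c|_s = 0$

Figure 7: Elimination of Boolean Algebra Unnested Formulas. Expression $x + y$ is a shorthand for $(x \cap y^c) \cup (y \cap x^c)$.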



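In this case, formula $\phi_6 \land \psi_1$ is contradictory. Formula $\phi_6 \land \psi_2$ is equivalent to $\phi_6''$ where

$$
\begin{aligned}
\phi_6'' \equiv\ & \exists u_z, u_w, u_{z1}, u_{z2}, u_{w1}, u_{w2}, u^s_w, u^s_{w1}, u^s_{w2}.\\
& u^s_w = g^s(u^s_{w1}, u^s_{w2}) \land \mathsf{distinct}(u^s_w, u^s_{w1}, u^s_{w2}) \land{}\\
& u_z = g(u_{z1}, u_{z2}) \land u_w = g(u_{w1}, u_{w2}) \land{}\\
& z = u_z \land w = u_w \land{}\\
& \mathsf{sh}(u_z) = u^s_w \land \mathsf{sh}(u_w) = u^s_w \land{}\\
& \mathsf{sh}(u_{z1}) = u^s_{w1} \land \mathsf{sh}(u_{w1}) = u^s_{w1} \land{}\\
& \mathsf{sh}(u_{z2}) = u^s_{w2} \land \mathsf{sh}(u_{w2}) = u^s_{w2} \land{}\\
& |u_{w1} \cap u_{z1}^c|_{u^s_{w1}} \geq 1 \land{}\\
& |(u_z \cap u_w^c) \cup (u_z^c \cap u_w)|_{u^s_w} \geq 1
\end{aligned}
\tag{55}
$$

As in Example 42, we now apply rule (25) to

$$
|(u_z \cap u_w^c) \cup (u_z^c \cap u_w)|_{u^s_w} \geq 1
$$

and transform $\phi_6''$ into a disjunction of base formulas.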
♦ 
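The symmetric-difference encoding used in Figure 7 and in rule (22) can be checked by brute force on a small boolean algebra of sets. The following is only an illustrative sketch: the helper names `subsets`, `sym_diff`, and `check_figure7` are hypothetical, and the universe of three "leaves" is an arbitrary choice, not something taken from the paper.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def sym_diff(x, y, universe):
    """x + y = (x ∩ y^c) ∪ (y ∩ x^c), the shorthand from Figure 7."""
    return (x & (universe - y)) | (y & (universe - x))

def check_figure7(universe):
    """Check each row of Figure 7, and rule (22) restricted to one shape."""
    all_sets = subsets(universe)
    for x in all_sets:
        for x1 in all_sets:
            comp = universe - x1
            # x = x1^c  iff  |x + x1^c| = 0
            assert (x == comp) == (len(sym_diff(x, comp, universe)) == 0)
            # x != x1  iff  |x + x1| >= 1  (same-shape case of rule (22))
            assert (x != x1) == (len(sym_diff(x, x1, universe)) >= 1)
            for x2 in all_sets:
                # x = x1 ∩ x2  iff  |x + (x1 ∩ x2)| = 0
                assert (x == x1 & x2) == (len(sym_diff(x, x1 & x2, universe)) == 0)
                # x = x1 ∪ x2  iff  |x + (x1 ∪ x2)| = 0
                assert (x == x1 | x2) == (len(sym_diff(x, x1 | x2, universe)) == 0)
    return True

print(check_figure7(frozenset({0, 1, 2})))
```

The check is exhaustive only for the chosen universe; it illustrates why an equality between boolean algebra terms can always be traded for a cardinality constraint of the form $|t|_s = 0$.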

We proceed to sketch the general case of quantifier elimination. The following Proposition 44 is analogous to Proposition 27; the proof is again straightforward.

Proposition 44 (Quantification of Structural Base)

If $\beta$ is a structural base formula and $x$ a free term variable in $\beta$, then there exists a structural base formula $\beta_1$ equivalent to $\exists x.\beta$.

The following Proposition 45 corresponds to Proposition 28.

Proposition 45 (Quantifier-Free to Structural Base)

Every well-defined quantifier-free formula in the language of Figure 5 can be written as true, false, or a disjunction of structural base formulas.

Proof Sketch. Let $\phi$ be a well-defined quantifier-free formula in the language of Figure 5.

We first use rule (21) to eliminate occurrences of $\leq$ in the formula, replacing them with cardinality constraints.

We then convert the formula into a disjunction $\phi_1 \lor \cdots \lor \phi_n$ of well-formed conjunctions of literals. We next describe how to transform each conjunction $\phi_i$ into a disjunction of base formulas.

Let $\phi_i$ be a conjunction of literals. Using the technique of Proposition 9, we convert the formula to unnested form, adding existential quantifiers. We then eliminate unnested conjuncts that contain boolean algebra operations, according to Figure 7. The only atomic formulas in the resulting existentially quantified conjunction are of form $x = a$, $x = b$, $x = g(x_1, x_2)$, $\mathsf{Is}_g(x)$, $x_1 = g_1(x)$, $x_2 = g_2(x)$, $x_1 = x_2$, $x^s = c^s$, $x^s = g^s(x^s_1, x^s_2)$, $\mathsf{Is}_{g^s}(x^s)$, $x^s_1 = g^s_1(x^s)$, $x^s_2 = g^s_2(x^s)$, $x^s_1 = x^s_2$, $x^s = \mathsf{sh}(x)$, as well as $|t_1|_{x^s} \geq k$ and $|t_2|_{x^s} = k$ for some $x^s$-terms $t_1$ and $t_2$. The only negated atomic formulas are of form $x_1 \neq x_2$, $x^s_1 \neq x^s_2$, $\lnot\mathsf{Is}_g(x)$ and $\lnot\mathsf{Is}_{g^s}(x^s)$. As in the proof of Proposition 28, we use (17) to eliminate $\lnot\mathsf{Is}_g(x)$ and $\lnot\mathsf{Is}_{g^s}(x^s)$. This process leaves formulas of form $x_1 \neq x_2$ and $x^s_1 \neq x^s_2$ as the only negated atomic formulas.

In the sequel, whenever we perform case analysis and generate a disjunction of conjunctions, existential quantifiers propagate to the conjunctions, so we keep working with an existentially quantified conjunction. The existentially quantified variables will become internal variables of a structural base formula.

We next convert conjuncts that contain only term variables to a base formula, and convert the shape part to a base
formula, as in the proof of Proposition 28. We simultane- 
ously make sure every term variable has an associated shape 
variable, introducing new shape variables if needed. (This 
process is interleaved with conversion to base formula, to en- 
sure that there is always a conjunct stating that newly intro- 
duced shape variables are distinct.) We also ensure homo- 
morphism requirement by replacing internal variables when 
we entail their equality. Another condition we ensure is that 
parameter term variables map to parameter shape variables, 
and non-parameter term variables to non-parameter shape 
variables; we do this by performing expansion of term and 
shape variables. We perform expansion of shape variables as 
in Section 3.2. Expansion of term variables is even simpler 
because there is no need to do case analysis on equality of 
term variable with other variables. 

The resulting existentially quantified conjunction might contain disequalities $u \neq u'$ between term variables. We eliminate these disequalities as explained in Example 43, by converting each disequality into a cardinality constraint using (22). In general, we need to consider the case when $\mathsf{sh}(u) \neq \mathsf{sh}(u')$ and generate another disjunct.

Elimination of disequalities might violate previously established homomorphism invariants, so we may need to reestablish these invariants by repeating the previously described steps. The overall process terminates because we never introduce new inequalities between term variables.

As a final step, we convert all cardinality constraints into constraints on parameter term variables, using (25). In the case when the shape of a cardinality constraint is $c^s$, we cannot apply (25). However, in that case the homomorphism condition ensures that each of the participating variables is equal to $a$ or equal to $b$. This means that we can simply evaluate the cardinality constraint in the boolean algebra $\{a, b\}$. If the result is true we simply drop the constraint, otherwise the entire base formula becomes false.

This completes our sketch of transforming a quantifier- 
free formula into disjunction of structural base formulas. ■ 

We introduce the notion of covered variables in structural 
base formula by generalizing Definition 29. 

Definition 46 The set covering of variable coverings of a structural base formula $\beta$ is the least set $S$ of pairs $\langle u, t\rangle$ where $u$ is an internal (shape or term) variable and $t$ is a term over the free variables of $\beta$, such that:

1. if $x = u$ occurs in termBase then $\langle u, x\rangle \in S$;

2. if $x^s = u^s$ occurs in shapeBase then $\langle u^s, x^s\rangle \in S$;

3. if $\langle u, t\rangle \in S$ and $u = f(u_1, \ldots, u_k)$ occurs in termBase for some $f \in \Sigma$ then $\{\langle u_1, f_1(t)\rangle, \ldots, \langle u_k, f_k(t)\rangle\} \subseteq S$;

4. if $\langle u^s, t^s\rangle \in S$ and $u^s = f^s(u^s_1, \ldots, u^s_k)$ occurs in shapeBase then $\{\langle u^s_1, f^s_1(t^s)\rangle, \ldots, \langle u^s_k, f^s_k(t^s)\rangle\} \subseteq S$;

5. if $\langle u, t\rangle \in S$ and $\mathsf{sh}(u) = u^s$ occurs in hom then $\langle u^s, \mathsf{sh}(t)\rangle \in S$.

Definition 47 An internal term variable $u$ is covered iff there exists a term $t$ such that $\langle u, t\rangle \in S$. An internal shape variable $u^s$ is covered iff there exists a term $t^s$ such that $\langle u^s, t^s\rangle \in S$.
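Definition 46 describes a least fixpoint, so the covering can be computed by saturating the five rules. The sketch below is one possible way to do this, under assumptions that are not in the paper: covering terms are built as plain strings, a base formula is passed as dictionaries, term and shape variables have disjoint names, and the names `covering`, `usw`, `uz1`, etc. are hypothetical. The sample data mirrors formula $\phi_6$ of Example 42.

```python
def covering(free_eqs, shape_free_eqs, term_defs, shape_defs, hom):
    """Saturate rules 1-5 of Definition 46.

    free_eqs / shape_free_eqs: lists of (free variable, internal variable).
    term_defs / shape_defs: internal variable -> (constructor, [children]).
    hom: term variable -> shape variable, for conjuncts sh(u) = u^s.
    """
    S = {(u, x) for x, u in free_eqs}                 # rule 1
    S |= {(us, xs) for xs, us in shape_free_eqs}      # rule 2
    changed = True
    while changed:
        changed = False
        for (u, t) in list(S):
            new = set()
            for defs in (term_defs, shape_defs):      # rules 3 and 4
                if u in defs:
                    f, kids = defs[u]
                    for i, kid in enumerate(kids, start=1):
                        new.add((kid, f"{f}{i}({t})"))  # selector f_i applied to t
            if u in hom:                              # rule 5
                new.add((hom[u], f"sh({t})"))
            if not new <= S:
                S |= new
                changed = True
    return S

# Data for phi_6: z = u_z, w = u_w, u_z = g(u_z1, u_z2), u_w = g(u_w1, u_w2), ...
S = covering(
    free_eqs=[("z", "uz"), ("w", "uw")],
    shape_free_eqs=[],
    term_defs={"uz": ("g", ["uz1", "uz2"]), "uw": ("g", ["uw1", "uw2"])},
    shape_defs={"usw": ("gs", ["usw1", "usw2"])},
    hom={"uz": "usw", "uw": "usw", "uz1": "usw1", "uw1": "usw1",
         "uz2": "usw2", "uw2": "usw2"},
)
covered = {u for (u, _) in S}
print(sorted(covered))
```

Termination relies on the base formula being acyclic, so that only finitely many selector terms can be generated. In this example every internal variable ends up covered, which is the situation handled by Corollary 49.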

Lemma 48 Let $\beta$ be a structural base formula with matrix $\beta_0$ and let $S$ be the covering of $\beta$.

1. If $\langle u, t\rangle \in S$ then $\models \beta_0 \Rightarrow u = t$.

2. If $\langle u^s, t^s\rangle \in S$ then $\models \beta_0 \Rightarrow u^s = t^s$.

Proof. By induction, using Definition 46. ■

Corollary 49 Let $\beta$ be a structural base formula such that every internal variable is covered. Then $\beta$ is equivalent to a well-defined quantifier-free formula.

Proof. By Lemma 48 using (7). ■

Lemma 50 Let $u$ be an uncovered non-parameter term variable in a structural base formula $\beta$ such that $u$ is a source, i.e. no conjunct of form

$$
u' = f(u_1, \ldots, u, \ldots, u_k)
$$

occurs in termBase. Let $\beta'$ be the result of dropping $u$ from $\beta$. Then $\beta$ is equivalent to $\beta'$.

Proof. Let $u$ occur in termBase in form

$$
u = f(u_1, \ldots, u_k)
$$

The only other occurrence of $u$ in $\beta$ is in hom and has the form $\mathsf{sh}(u) = u^s$. Because non-parameter term variables are mapped to non-parameter shape variables, shapeBase contains formula

$$
u^s = \mathsf{shapified}(f)(u^s_1, \ldots, u^s_k)
\tag{56}
$$

where $u^s_1, \ldots, u^s_k$ are such that, by the homomorphism property, $\mathsf{sh}(u_i) = u^s_i$ occurs in hom. This means that the conjunct $\mathsf{sh}(u) = u^s$ is a consequence of the remaining conjuncts, so it may be omitted. After that, applying (7) yields a structural base formula $\beta'$ not containing $u$, where $\beta'$ is equivalent to $\beta$. ■

Corollary 51 Every base formula is equivalent to a base formula without uncovered non-parameter term variables.

Proof. If a structural base formula has an uncovered non-parameter term variable, then it has an uncovered non-parameter term variable that is a source. By repeated application of Lemma 50 we eliminate all uncovered non-parameter term variables. ■

The next example illustrates how we deal with cardinality constraints $|1_s|_s \geq k$ and $|1_s|_s = k$, which contain no term variables. These constraints restrict the size of shape $s$. Luckily, we can translate them into shape base formula constraints.
Example 52 (Shape Term Size Constraints)

Let $x < y$ denote the conjunction $x \leq y \land x \neq y$. Let us eliminate quantifiers from formula $\exists x.\phi(x)$ where

$$
\phi(x) \equiv \lnot(\exists y.\exists z.\ x < y \land y < z) \land \lnot(\exists u.\ u < x)
\tag{57}
$$

Eliminating variables $y$, $z$ from the first conjunct and variable $u$ from the second conjunct yields

$$
\lnot\, |x^c|_{\mathsf{sh}(x)} \geq 2 \ \land\ \lnot\, |x|_{\mathsf{sh}(x)} \geq 1
$$

which is equivalent to

$$
(|x^c|_{\mathsf{sh}(x)} = 0 \lor |x^c|_{\mathsf{sh}(x)} = 1) \land |x|_{\mathsf{sh}(x)} = 0
$$

and further to the disjunction

$$
(|x^c|_{\mathsf{sh}(x)} = 0 \land |x|_{\mathsf{sh}(x)} = 0) \lor (|x^c|_{\mathsf{sh}(x)} = 1 \land |x|_{\mathsf{sh}(x)} = 0)
$$

The first disjunct can be shown contradictory. Let us transform the second disjunct into a structural base formula. After introducing $u = x$ and $u^s = \mathsf{sh}(u)$, we obtain

$$
\exists u, u^s.\ x = u \land \mathsf{sh}(u) = u^s \land |u|_{u^s} = 0 \land |u^c|_{u^s} = 1
$$

Then $\exists x.\phi(x)$ is equivalent to

$$
\exists u, u^s.\ \mathsf{sh}(u) = u^s \land |u|_{u^s} = 0 \land |u^c|_{u^s} = 1
$$

Eliminating parameter term variable $u$ yields

$$
\exists u^s.\ |1|_{u^s} = 1
$$

Constraint $|1|_{u^s} = 1$ means that the largest set in the boolean algebra $B(s)$, where $s$ is the value of $u^s$, has size one. There exists exactly one boolean algebra of size one in the structure $FT_2$, namely $\{a, b\}$. Therefore, $|1|_{u^s} = 1$ is equivalent to $u^s = c^s$. We may now eliminate $u^s$ by letting $u^s = c^s$. We conclude that the sentence $\exists x.\phi(x)$ is true.

Notice that we have also established that formula $\phi(x)$ is equivalent to $\mathsf{sh}(x) = c^s$, as a consequence of

$$
|1_{\mathsf{sh}(x)}|_{\mathsf{sh}(x)} = 1
$$

♦ 
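The cardinality reasoning in Example 52 can be sanity-checked by enumeration. The sketch below assumes, as a simplification not spelled out in this section, that $a \leq b$ holds in the base structure and that terms of one fixed shape with $n$ leaves can be modelled as tuples over $\{0, 1\}$ (with $0$ for $a$ and $1$ for $b$) ordered pointwise; terms of different shapes are incomparable, so each shape can be examined separately. The function names are hypothetical.

```python
from itertools import product

def leq(x, y):
    """Pointwise order on terms of the same shape (assumes a <= b, all covariant)."""
    return all(p <= q for p, q in zip(x, y))

def lt(x, y):
    """Strict order: x <= y and x != y."""
    return leq(x, y) and x != y

def phi(x, terms):
    """Formula (57): no chain x < y < z above x, and nothing strictly below x."""
    has_chain = any(lt(x, y) and lt(y, z) for y in terms for z in terms)
    has_below = any(lt(u, x) for u in terms)
    return not has_chain and not has_below

def satisfying(n):
    """All terms with n leaves (one fixed shape) that satisfy phi."""
    terms = list(product((0, 1), repeat=n))
    return [x for x in terms if phi(x, terms)]

def matches_cardinalities(n):
    """Check phi(x) <=> (|x^c| <= 1 and |x| = 0), with |x| = number of b-leaves."""
    terms = list(product((0, 1), repeat=n))
    return all(phi(x, terms) == (x.count(0) <= 1 and x.count(1) == 0)
               for x in terms)

for n in (1, 2, 3):
    print(n, satisfying(n), matches_cardinalities(n))
```

Under these assumptions only the one-leaf shape admits a witness, which agrees with the conclusion of the example that the constraint forces $u^s = c^s$.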

The following Proposition 53 corresponds to Proposi- 
tion 38. 

Proposition 53 (Struct. Base to Quantifier-Free)

Every structural base formula $\beta$ is equivalent to a quantifier-free formula $\phi$ in the language of Figure 5.

Proof Sketch. By Corollary 51 we may assume that $\beta$ has no uncovered non-parameter term variables. By Corollary 49 we are done if there are no uncovered variables, so it suffices to eliminate uncovered parameter term variables and uncovered shape variables.

Let $u$ be an uncovered parameter term variable. Then $u$ does not occur in termBase. Indeed, suppose for the sake of contradiction that $u$ occurs in termBase in some formula

$$
u' = f(u_1, \ldots, u, \ldots, u_k)
$$

Then $u'$ is an uncovered non-parameter variable in $\beta$, which is a contradiction because we have assumed $\beta$ has no uncovered non-parameter variables. Therefore, $u$ does not occur in termBase; it occurs only in hom and cardin. Let $\mathsf{sh}(u) = u^s$ occur in hom. Let $\psi_1, \ldots, \psi_p$ be all conjuncts of cardin that contain $u$. Each $\psi_i$ is of form $|t_i|_{u^s} \geq k_i$ or $|t_i|_{u^s} = k_i$ for some $u^s$-term $t_i$. Let $u_{j_1}, \ldots, u_{j_q}$ be all term variables appearing in the terms $t_i$ other than $u$. Conjunct $\mathsf{sh}(u_{j_r}) = u^s$ occurs in hom for each $r$ where $1 \leq r \leq q$. The base formula can therefore be written in form

$$
\beta_1 \equiv \exists x_1, \ldots, x_e, x^s_1, \ldots, x^s_f.\ \phi \land \phi_1
$$

where

$$
\begin{aligned}
\phi_1 \equiv\ & \exists u.\ \mathsf{sh}(u) = u^s \land{}\\
& \mathsf{sh}(u_{j_1}) = u^s \land \cdots \land \mathsf{sh}(u_{j_q}) = u^s \land{}\\
& \psi_1 \land \cdots \land \psi_p
\end{aligned}
\tag{58}
$$

All term variables in $\psi_1, \ldots, \psi_p$ range over terms of shape $u^s$. Therefore, $\phi_1$ defines a relation in the boolean algebra $B([\![u^s]\!])$. This allows us to apply the construction in Section 3.2. We eliminate $u$ from $\psi_1 \land \cdots \land \psi_p$ and obtain a propositional combination $\psi_0$ of cardinality constraints with $u^s$-terms. $\psi_0$ does not contain variable $u$. We may assume that $\psi_0$ is in disjunctive normal form

$$
\psi_0 \equiv \alpha_1 \lor \cdots \lor \alpha_w
$$

Let

$$
\phi_{1,i} \equiv \mathsf{sh}(u_{j_1}) = u^s \land \cdots \land \mathsf{sh}(u_{j_q}) = u^s \land \alpha_i
$$

for $1 \leq i \leq w$. Base formula $\beta_1$ is equivalent to the disjunction of base formulas $\beta_{1,i}$ where

$$
\beta_{1,i} \equiv \exists x_1, \ldots, x_e, x^s_1, \ldots, x^s_f.\ \phi \land \phi_{1,i}
$$

We have thus eliminated an uncovered parameter term variable $u$ from $\beta_1$. By repeating this process we eliminate all uncovered parameter term variables from a base formula. The resulting formula contains no uncovered term variables.

It remains to eliminate uncovered shape variables. This 
process is similar to term algebra quantifier elimination in 
Section 3.4. An essential part of construction in Section 3.4 
is Lemma 25, which relies on the fact that uncovered pa- 
rameter variables may take on infinitely many values. We 
therefore ensure that uncovered parameter shape variables 
are not constrained by term variables through conjuncts out- 
side shapeBase. 

Suppose that $u^s$ is an uncovered parameter shape variable in a base formula $\beta$. $u^s$ does not occur in termBase. $u^s$ does not occur in hom either, because all term variables are covered, and a conjunct $\mathsf{sh}(u) = u^s$ would imply that $u^s$ is covered. The only possible occurrence of $u^s$ is in a cardinality constraint $\psi_i$ of subformula cardin, where $\psi_i$ is of form $|t|_{u^s} = k$ or of form $|t|_{u^s} \geq k$. Suppose there is some term variable $u$ occurring in $t$. Then $\mathsf{sh}(u) = u^s$, so $u^s$ is covered, which is a contradiction. Therefore, $t$ has no variables. $t$ can thus be simplified to either $0_{u^s}$ or $1_{u^s}$. In general, a constraint of form $|1|_{u^s} = k$ or $|1|_{u^s} \geq k$ is a domain cardinality constraint for boolean algebra $B([\![u^s]\!])$ (see Remark 15 as well as (20)). A constraint containing $|0_{u^s}|$ is equivalent to true or false. A constraint $|1|_{u^s} = 0$ is equivalent to false. A constraint $|1|_{u^s} = k$ for $k \geq 1$ is equivalent to

$$
u^s = t^s_1 \lor \cdots \lor u^s = t^s_p
$$

where $t^s_1, \ldots, t^s_p$ is the list of all ground terms in signature $\Sigma^s_0$ that have exactly $k$ occurrences of constant $c^s$. We therefore generate a disjunction of base formulas $\beta_1, \ldots, \beta_p$ where $\beta_i$ results from $\beta$ by replacing $|1|_{u^s} = k$ with $u^s = t^s_i$. We convert each $\beta_i$ to a disjunction of base formulas by labelling subterms of $t^s_i$ by internal shape variables and doing case analysis on the equality between new internal shape variables to ensure the invariants of a base formula. The result is a disjunction of base formulas where variable $u^s$ occurs only in the shapeBase subformula.
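The list $t^s_1, \ldots, t^s_p$ is finite and can be enumerated directly. As a rough illustration for the setting of this section (one binary shape constructor and the single shape constant), the following sketch lists all shapes with exactly $k$ leaves; the string encoding and the names `shapes_with_leaves`, `cs`, and `gs` are assumptions made for the example only.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def shapes_with_leaves(k):
    """All ground shape terms over {cs, gs(_, _)} with exactly k occurrences of cs."""
    if k < 1:
        return ()
    if k == 1:
        return ("cs",)
    result = []
    # Split the k leaves between the left and the right argument of gs.
    for left in range(1, k):
        for a in shapes_with_leaves(left):
            for b in shapes_with_leaves(k - left):
                result.append(f"gs({a},{b})")
    return tuple(result)

for k in range(1, 6):
    print(k, len(shapes_with_leaves(k)))
```

The number of such shapes grows as the Catalan numbers (1, 1, 2, 5, 14, ...), so the disjunction replacing $|1|_{u^s} = k$ is finite but can be large.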

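Similarly, $|1|_{u^s} \geq k + 1$ is equivalent to $\lnot(|1|_{u^s} \leq k)$ and thus to

$$
u^s \neq t^s_1 \land \cdots \land u^s \neq t^s_q
\tag{59}
$$

where $t^s_1, \ldots, t^s_q$ is the list of all ground terms in signature $\Sigma^s_0$ that have at most $k$ occurrences of constant $c^s$. We replace $|1|_{u^s} \geq k + 1$ by (59) and again convert the result to a disjunction of base formulas where $u^s$ occurs only in the shapeBase subformula.

Each of the resulting base formulas $\beta^1$ is such that every uncovered variable in $\beta^1$ is a shape variable that occurs only in shapeBase. Let

$$
\begin{aligned}
\beta^1 \equiv\ & \exists u_1, \ldots, u_n, u^s_1, \ldots, u^s_{p^s}, u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}.\\
& \mathsf{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) \land{}\\
& \mathsf{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) \land{}\\
& \mathsf{hom}(u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}) \land{}\\
& \mathsf{cardin}(u_{p+1}, \ldots, u_{p+q}, u^s_{p^s+1}, \ldots, u^s_{p^s+q^s})
\end{aligned}
$$

where $u^s_1, \ldots, u^s_{p^s}$ are uncovered shape variables. Then $\beta^1$ is equivalent to

$$
\begin{aligned}
\beta^2 \equiv\ & \exists u_1, \ldots, u_n, u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}.\\
& \phi^s(u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}, x^s_1, \ldots, x^s_{m^s}) \land{}\\
& \mathsf{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) \land{}\\
& \mathsf{hom}(u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}) \land{}\\
& \mathsf{cardin}(u_{p+1}, \ldots, u_{p+q}, u^s_{p^s+1}, \ldots, u^s_{p^s+q^s})
\end{aligned}
$$

Here $\phi^s$ is a base formula (Definitions 19 and 21) whose free variables are the variables free in $\beta^1$ as well as all covered shape variables:

$$
\begin{aligned}
\phi^s(u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}, x^s_1, \ldots, x^s_{m^s}) \equiv\ & \exists u^s_1, \ldots, u^s_{p^s}.\\
& \mathsf{shapeBase}(u^s_1, \ldots, u^s_{p^s}, u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}, x^s_1, \ldots, x^s_{m^s})
\end{aligned}
$$

Applying Lemma 37 we conclude that $\phi^s$ is equivalent to some disjunction

$$
\bigvee_{i=1}^{k} \phi^{s,i}
$$

of base formulas without uncovered variables. Let $\beta^{2,i}$ be the result of replacing $\phi^s$ with $\phi^{s,i}$ in $\beta^2$. Then $\beta^2$ is equivalent to

$$
\bigvee_{i=1}^{k} \beta^{2,i}
$$

and each $\beta^{2,i}$ has no uncovered variables either, because every free variable of $\phi^{s,i}$ is either free or covered in $\beta^2$. By Corollary 49 each $\beta^{2,i}$ can be written as a quantifier-free formula. ■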

The following Theorem 54 corresponds to Theorem 39 of Section 3.4.

Theorem 54 (Two Constants Quant. Elimination)

There exist algorithms A, B such that for a given formula $\phi$ in the language of Figure 5:

a) A produces a quantifier-free formula $\phi'$ in selector language

b) B produces a disjunction $\phi'$ of structural base formulas

Proof. Analogous to the proof of Theorem 39, using Proposition 45 in place of Proposition 28 and Proposition 53 in place of Proposition 38. ■

Corollary 55 The first-order theory of the structure FT2 is 

decidable. 

This completes the description of our quantifier elimination 
for the first-order theory of the structure FT2, which models 
structural subtyping with two base types and one binary 
constructor. It is straightforward to extend the construc- 
tion of this section to any number of covariant constructors 
if the base formula has only two constants. In Section 5 we 
extend the result to any number of constants as well. Fi- 
nally, in Section 6 we extend the result to allow arbitrary 
decidable structures for primitive types, even if the number 
of primitive types is infinite. 

5 A Finite Number of Constants 

In this section we prove the decidability of structural sub- 
typing of any finite number of constant symbols (primitive 
types) and any number of function symbols (constructors). 
We first show the result when all constructors are covariant, 
we then show the result when some of the constructors are 
contravariant. 

We introduce the notion of the $\Sigma$-term-power of some structure $\mathcal{C}$ as a generalization of the structure of structural subtyping.

We represent primitive types in structural subtyping as a structure $\mathcal{C}$ with a finite carrier $C$. We call $\mathcal{C}$ the base structure. Without loss of generality, we assume that $\mathcal{C}$ has only relations; functions and constants are definable using relations. Let $L_C$ be a set of relation symbols and let $\leq\ \in L_C$ be a distinguished binary relation symbol. $\leq$ represents the subtype ordering between types. $\mathcal{C}$ is finite, so $\mathcal{C}$ is decidable (see Section 6 for the case when $\mathcal{C}$ is infinite but decidable).

We represent type constructors as free operations in the term algebra with signature $\Sigma$. To represent the variance of constructors we define for each constructor $f \in \Sigma$ of arity $\mathsf{ar}(f) = k$ and each argument $1 \leq i \leq k$ the value $\mathsf{variance}(f, i) \in \{-1, 1\}$. The constructor $f$ is covariant in argument $i$ iff $\mathsf{variance}(f, i) = 1$. For convenience we assume $\mathsf{ar}(f) \geq 1$ for each $f \in \Sigma$.

The $\Sigma$-term-power of $\mathcal{C}$ is a structure $\mathcal{P}$ defined as follows. Let $\Sigma' = \Sigma \cup C$. The domain of $\mathcal{P}$ is the set $P$ of finite ground $\Sigma'$-terms. Elements of $C$ are viewed as constants of arity 0. The structure $\mathcal{P}$ has signature $\Sigma \cup L_C$. The constructors $f \in \Sigma$ are interpreted in $\mathcal{P}$ as in a free term algebra:

$$
[\![f]\!]^{\mathcal{P}}(t_1, \ldots, t_k) = f(t_1, \ldots, t_k)
$$

A relation $r \in L_C \setminus \{\leq\}$ is interpreted pointwise on the terms of the same "shape" as follows. $[\![r]\!]^{\mathcal{P}}$ is the least relation $\rho$ such that:

1. if $[\![r]\!]^{\mathcal{C}}(c_1, \ldots, c_n)$ then $\rho(c_1, \ldots, c_n)$

2. if $\rho(t_{1i}, \ldots, t_{ni})$ for all $i$ where $1 \leq i \leq k$, then $\rho(f(t_{11}, \ldots, t_{1k}), \ldots, f(t_{n1}, \ldots, t_{nk}))$

The relation $\leq\ \in L_C$ is interpreted similarly, but taking into account the variance. $[\![\leq]\!]^{\mathcal{P}}$ is the least relation $\rho$ such that

1. if $[\![\leq]\!]^{\mathcal{C}}(c_1, c_2)$ then $\rho(c_1, c_2)$

2. if $\rho^{\mathsf{variance}(f,i)}(t_{1i}, \ldots, t_{ni})$ for all $i$ where $1 \leq i \leq k$, then $\rho(f(t_{11}, \ldots, t_{1k}), \ldots, f(t_{n1}, \ldots, t_{nk}))$

Here we use the notation $\rho^v$ for $v \in \{-1, 1\}$ with the meaning: $\rho^1 = \rho$ and $\rho^{-1} = \{\langle y, x\rangle \mid \langle x, y\rangle \in \rho\}$.
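Because terms are finite, the least relation defining $\leq$ on the term-power can be decided by structural recursion. The following sketch illustrates this under assumptions chosen only for the example: terms are nested tuples `(constructor, arg1, ..., argk)`, base elements are strings, the base ordering is given as a set of pairs, and the names `int`, `real`, and `fun` are hypothetical.

```python
def term_leq(s, t, base_leq, variance):
    """Decide s <= t in the term-power, following the two rules for <=.

    base_leq: set of pairs (c1, c2) with c1 <= c2 in the base structure.
    variance: maps (constructor, argument index) to 1 or -1.
    """
    s_is_base = not isinstance(s, tuple)
    t_is_base = not isinstance(t, tuple)
    if s_is_base or t_is_base:
        # Rule 1: base elements are compared in the base structure.
        return s_is_base and t_is_base and (s, t) in base_leq
    if s[0] != t[0] or len(s) != len(t):
        return False  # different shapes are never related
    for i in range(1, len(s)):
        a, b = s[i], t[i]
        if variance[(s[0], i)] == -1:
            a, b = b, a  # rho^{-1}: swap the arguments for a contravariant position
        if not term_leq(a, b, base_leq, variance):
            return False
    return True

# Hypothetical base structure: int <= real, plus reflexive pairs.
base_leq = {("int", "int"), ("real", "real"), ("int", "real")}
# Hypothetical constructor: contravariant in argument 1, covariant in argument 2.
variance = {("fun", 1): -1, ("fun", 2): 1}

print(term_leq(("fun", "real", "int"), ("fun", "int", "real"), base_leq, variance))
print(term_leq(("fun", "int", "real"), ("fun", "real", "int"), base_leq, variance))
```

The first comparison succeeds and the second fails, matching the usual intuition for a function-type constructor.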

We next sketch the decidability of structural subtyping for any finite number of primitive types $C$. For now we assume that all constructors $f \in \Sigma$ are covariant; the relation $\leq$ thus does not play a special role.

5.1 Extended Term-Power Structure 

For the purpose of quantifier elimination we define the structure $\mathcal{P}_E$ by extending the domain and the set of operations of the term-power structure $\mathcal{P}$.

The domain of $\mathcal{P}_E$ is $P_E = P \cup P_S$ where $P_S$ is the set of shapes defined as follows. Let $\Sigma^s = \{c^s\} \cup \{f^s \mid f \in \Sigma\}$ be a set of function symbols such that $c^s$ is a fresh constant symbol with $\mathsf{ar}(c^s) = 0$ and $f^s$ are fresh distinct function symbols with $\mathsf{ar}(f^s) = \mathsf{ar}(f)$ for each $f \in \Sigma$. The set of shapes $P_S$ is the set of ground $\Sigma^s$-terms. When referring to elements of $\mathcal{P}_E$ by term we mean an element of $P$; by shape we mean an element of $P_S$. We write $X^s$ to denote an entity pertaining to shapes as opposed to terms, so $x^s$, $u^s$ denote variables ranging over shapes, and $t^s$ denotes terms that evaluate to shapes.

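The extended structure $\mathcal{P}_E$ contains term algebra operations on terms and shapes (including selector operations and tests, [22, Page 61]), the homomorphism $\mathsf{sh}$, and cardinality constraint relations $|\phi|_{t^s} = k$ and $|\phi|_{t^s} \geq k$:

1. constructors in the term algebra of terms, $f \in \Sigma'$:
   $[\![f]\!]^{\mathcal{P}_E}(t_1, \ldots, t_k) = f(t_1, \ldots, t_k)$;

2. selectors in the term algebra of terms,
   $[\![f_i]\!]^{\mathcal{P}_E}(f(t_1, \ldots, t_k)) = t_i$;

3. constructor tests in the term algebra of terms,
   $[\![\mathsf{Is}_f]\!]^{\mathcal{P}_E}(t) = \exists t_1, \ldots, t_k.\ t = f(t_1, \ldots, t_k)$;

4. constructors in the term algebra of shapes, $f^s \in \Sigma^s$:
   $[\![f^s]\!]^{\mathcal{P}_E}(t^s_1, \ldots, t^s_k) = f^s(t^s_1, \ldots, t^s_k)$;

5. selectors in the term algebra of shapes,
   $[\![f^s_i]\!]^{\mathcal{P}_E}(f^s(t^s_1, \ldots, t^s_k)) = t^s_i$;

6. constructor tests in the term algebra of shapes,
   $[\![\mathsf{Is}_{f^s}]\!]^{\mathcal{P}_E}(t^s) = \exists t^s_1, \ldots, t^s_k.\ t^s = f^s(t^s_1, \ldots, t^s_k)$;

7. the homomorphism mapping terms to shapes such that:

$$
[\![\mathsf{sh}]\!]^{\mathcal{P}_E}(f(t_1, \ldots, t_k)) = \mathsf{shapified}(f)([\![\mathsf{sh}]\!]^{\mathcal{P}_E}(t_1), \ldots, [\![\mathsf{sh}]\!]^{\mathcal{P}_E}(t_k))
\tag{60}
$$

   where

$$
\begin{aligned}
\mathsf{shapified}(x) &= c^s, && \text{if } x \in C\\
\mathsf{shapified}(f) &= f^s, && \text{if } f \in \Sigma
\end{aligned}
\tag{61}
$$

8. cardinality constraint relations

$$
[\![\,|\phi(x_1, \ldots, x_k)|_{t^s} = k\,]\!]^{\mathcal{P}_E}(t_1, \ldots, t_k) \iff \bigl|[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{P}_E}(t_1, \ldots, t_k)\bigr| = k
\tag{62}
$$

   and

$$
[\![\,|\phi(x_1, \ldots, x_k)|_{t^s} \geq k\,]\!]^{\mathcal{P}_E}(t_1, \ldots, t_k) \iff \bigl|[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{P}_E}(t_1, \ldots, t_k)\bigr| \geq k
\tag{63}
$$

   where $\phi(x_1, \ldots, x_k)$ is a first-order formula over the base-structure language $L_C$ with free variables $x_1, \ldots, x_k$, term $t^s$ denotes a shape, and $k$ is a nonnegative integer constant.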

It remains to complete the semantics of cardinality constraint relations, by defining the set $[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{P}_E}(t_1, \ldots, t_k)$. If $s$ is a shape, we call the set of positions of constant $c^s$ in $s$ leaves of $s$, and denote it by $\mathsf{leaves}(s)$. We represent a leaf as a sequence of pairs $\langle f, i\rangle$ where $f$ is a constructor of arity $k$ and $1 \leq i \leq k$. If $l \in \mathsf{leaves}(s)$ and $\mathsf{sh}(t) = s$, then $t[l]$ denotes the element $c \in C$ at position $l$ in term $t$, i.e. if $l = \langle f^1, i^1\rangle \cdots \langle f^n, i^n\rangle$
then 

(64) 

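We define:

$$
[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{P}_E}(t_1, \ldots, t_k) = \{\, l \mid [\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{C}}(t_1[l], \ldots, t_k[l]) \,\}
\tag{65}
$$

The following equations follow from (65) and can be used as an equivalent alternative definition for cardinality relations:

$$
\bigl|[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{P}_E}(c_1, \ldots, c_k)\bigr| =
\begin{cases}
1, & [\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{C}}(c_1, \ldots, c_k)\\
0, & \lnot[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{C}}(c_1, \ldots, c_k)
\end{cases}
\tag{66}
$$

$$
\begin{aligned}
& \bigl|[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{P}_E}(f(t_{11}, \ldots, t_{1l}), \ldots, f(t_{k1}, \ldots, t_{kl}))\bigr|\\
&\quad = \bigl|[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{P}_E}(t_{11}, \ldots, t_{k1})\bigr| + \cdots + \bigl|[\![\phi(x_1, \ldots, x_k)]\!]^{\mathcal{P}_E}(t_{1l}, \ldots, t_{kl})\bigr|
\end{aligned}
\tag{67}
$$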
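The agreement between the leaf-based definition (65) and the recursive equations (66) and (67) can be illustrated on small terms. The sketch below makes several assumptions for the sake of the example: terms are nested tuples `(constructor, arg1, ..., argk)` with strings as base elements, a leaf is a root-to-leaf sequence of `(constructor, index)` pairs, the formula $\phi$ is given as a Python predicate on base elements, and all argument terms have the same shape. All function names are hypothetical.

```python
def leaves(t, path=()):
    """Positions of base elements in t, as root-to-leaf tuples of (constructor, index)."""
    if not isinstance(t, tuple):
        return [path]
    result = []
    for i in range(1, len(t)):
        result += leaves(t[i], path + ((t[0], i),))
    return result

def at(t, path):
    """The base element of t at the given leaf position."""
    for (_, i) in path:
        t = t[i]
    return t

def card_by_leaves(phi, terms):
    """Size of the set in (65): leaves where phi holds of the base elements."""
    return sum(1 for l in leaves(terms[0]) if phi(*(at(t, l) for t in terms)))

def card_by_recursion(phi, terms):
    """The same number computed with equations (66) and (67)."""
    if not isinstance(terms[0], tuple):
        return 1 if phi(*terms) else 0              # equation (66)
    arity = len(terms[0]) - 1
    return sum(card_by_recursion(phi, [t[i] for t in terms])   # equation (67)
               for i in range(1, arity + 1))

t1 = ("f", "a", ("f", "b", "a"))
t2 = ("f", "b", ("f", "b", "b"))
differs = lambda x, y: x != y
print(card_by_leaves(differs, [t1, t2]), card_by_recursion(differs, [t1, t2]))
```

Both functions count the leaves at which the base formula holds, so they return the same value whenever the argument terms share a shape.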

Definition (65) generalizes [14, Definition 2.1, Page 63]. We write $|\phi(t_1, \ldots, t_k)|_{t^s} = k$ as a shorthand for the atomic formula $(|\phi(x_1, \ldots, x_k)|_{t^s} = k)(t_1, \ldots, t_k)$, similarly for $|\phi(t_1, \ldots, t_k)|_{t^s} \geq k$. This is more than a notational convenience, see Section 6 for an approach which introduces sets of leaves as elements of the domain of $\mathcal{P}_E$ and defines a cylindric algebra interpreted over sets of leaves. The approach in this section follows [35] in merging the quantifier elimination for products and quantifier elimination for boolean algebras.

Some of the operations in $\mathcal{P}_E$ are partial. We use the definitions and results of Section 2.3 to deal with partial functions. $f_i(t)$ is defined iff $\mathsf{Is}_f(t)$ holds, $f^s_i(t^s)$ is defined iff $\mathsf{Is}_{f^s}(t^s)$ holds. Cardinality constraints $|\phi(t_1, \ldots, t_k)|_{t^s} = k$ and $|\phi(t_1, \ldots, t_k)|_{t^s} \geq k$ are defined iff $\mathsf{sh}(t_1) = \ldots = \mathsf{sh}(t_k) = t^s$ holds.

The structure $\mathcal{P}_E$ is at least as expressive as $\mathcal{P}$ because the only operations or relations present in $\mathcal{P}$ but not in $\mathcal{P}_E$ are $[\![r]\!]^{\mathcal{P}}$ for $r \in L_C$, and we can express $[\![r]\!]^{\mathcal{P}}(t_1, \ldots, t_k)$ as $|\lnot r(t_1, \ldots, t_k)|_{\mathsf{sh}(t_1)} = 0$.

Our goal is to give a quantifier elimination for first-order formulas of structure $\mathcal{P}_E$. By a quantifier-free formula we mean a formula without quantifiers outside cardinality constraints, e.g. the formula $|\forall x.\ x \leq t|_{x^s} = k$ is quantifier-free.
5.2 Structural Base Formulas

In this section we define the notion of structural base formulas for any base structure $\mathcal{C}$ with a finite carrier.

Definition 56 of structural base formula for quantifier elimination in $\mathcal{P}_E$ differs from Definition 41 in the conjuncts of the cardin subformula. Instead of cardinality constraints on boolean algebra terms, Definition 56 contains cardinality constraints on first-order formulas.

The notion of base formula and Lemma 25 apply to terms $P$ as well as shapes $P_S$ in the structure $\mathcal{P}_E$ because shapes are also terms over the alphabet $\Sigma^s$. For brevity we write $u^*$ for an internal shape or term variable, and similarly $x^*$ for a free shape or term variable, $t^*$ for terms, $f^*$ for a term or shape term algebra constructor and $f^*_i$ for a term or shape term algebra selector.

Definition 56 (Structural Base Formula)

A structural base formula with:

• free term variables $x_1, \ldots, x_m$;

• internal non-parameter term variables $u_1, \ldots, u_p$;

• internal parameter term variables $u_{p+1}, \ldots, u_{p+q}$;

• free shape variables $x^s_1, \ldots, x^s_{m^s}$;

• internal non-parameter shape variables $u^s_1, \ldots, u^s_{p^s}$;

• internal parameter shape variables $u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}$

is a formula of the form:

$$
\begin{aligned}
& \exists u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}.\\
& \quad \mathsf{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) \land{}\\
& \quad \mathsf{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) \land{}\\
& \quad \mathsf{termHom}(u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}) \land{}\\
& \quad \mathsf{cardin}(u_{p+1}, \ldots, u_n, u^s_{p^s+1}, \ldots, u^s_{n^s})
\end{aligned}
$$

where $n = p + q$, $n^s = p^s + q^s$, and formulas shapeBase, termBase, termHom, cardin are defined as follows.

$$
\mathsf{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) =
\bigwedge_{i=1}^{p^s} u^s_i = t_i(u^s_1, \ldots, u^s_{n^s}) \ \land\ \bigwedge_{i=1}^{m^s} x^s_i = u^s_{j_i} \ \land\ \mathsf{distinct}(u^s_1, \ldots, u^s_{n^s})
$$

where each $t_i$ is a shape term of the form $f^s(u^s_{i_1}, \ldots, u^s_{i_k})$ for some $f \in \Sigma_0$, $k = \mathsf{ar}(f)$, and $j : \{1, \ldots, m^s\} \to \{1, \ldots, n^s\}$ is a function mapping indices of free shape variables to indices of internal shape variables.

$$
\mathsf{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) =
\bigwedge_{i=1}^{p} u_i = t_i(u_1, \ldots, u_n) \ \land\ \bigwedge_{i=1}^{m} x_i = u_{j_i}
$$

where each $t_i$ is a term of the form $f(u_{i_1}, \ldots, u_{i_k})$ for some $f \in \Sigma$, $k = \mathsf{ar}(f)$, and $j : \{1, \ldots, m\} \to \{1, \ldots, n\}$ is a function mapping indices of free term variables to indices of internal term variables.

$$
\mathsf{termHom}(u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}) = \bigwedge_{i=1}^{n} \mathsf{sh}(u_i) = u^s_{j_i}
$$

where $j : \{1, \ldots, n\} \to \{1, \ldots, n^s\}$ is some function such that $\{j_1, \ldots, j_p\} \subseteq \{1, \ldots, p^s\}$ and $\{j_{p+1}, \ldots, j_{p+q}\} \subseteq \{p^s + 1, \ldots, p^s + q^s\}$ (a term variable is a parameter variable iff its shape is a parameter shape variable).

$$
\mathsf{cardin}(u_{p+1}, \ldots, u_n, u^s_{p^s+1}, \ldots, u^s_{n^s}) = \psi_1 \land \cdots \land \psi_d
$$

where each $\psi_i$ is a cardinality constraint of the form

$$
|\phi(u_{j_1}, \ldots, u_{j_l})|_{u^s} = k
$$

or

$$
|\phi(u_{j_1}, \ldots, u_{j_l})|_{u^s} \geq k
$$

where $\{j_1, \ldots, j_l\} \subseteq \{p + 1, \ldots, n\}$ and the conjunct $\mathsf{sh}(u_{j_d}) = u^s$ occurs in termHom for $1 \leq d \leq l$. We require each structural base formula to satisfy the following conditions:

P0) the graph associated with the shape base formula

$$
\exists u^s_1, \ldots, u^s_{n^s}.\ \mathsf{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s})
$$

is acyclic;

P1) congruence closure property for the shapeBase subformula: there are no two distinct variables $u^s_i$ and $u^s_j$ such that both $u^s_i = f(u^s_{l_1}, \ldots, u^s_{l_k})$ and $u^s_j = f(u^s_{l_1}, \ldots, u^s_{l_k})$ occur as conjuncts in formula shapeBase;

P2) congruence closure property for the termBase subformula: there are no two distinct variables $u_i$ and $u_j$ such that both $u_i = f(u_{l_1}, \ldots, u_{l_k})$ and $u_j = f(u_{l_1}, \ldots, u_{l_k})$ occur as conjuncts in formula termBase;

P3) homomorphism property of $\mathsf{sh}$: for every non-parameter term variable $u$ such that $u = f(u_{i_1}, \ldots, u_{i_k})$ occurs in termBase, if conjunct $\mathsf{sh}(u) = u^s$ occurs in termHom, then for some shape variables $u^s_{j_1}, \ldots, u^s_{j_k}$ the term $u^s = f^s(u^s_{j_1}, \ldots, u^s_{j_k})$ occurs in shapeBase where $f^s = \mathsf{shapified}(f)$ and for every $r$ where $1 \leq r \leq k$, conjunct $\mathsf{sh}(u_{i_r}) = u^s_{j_r}$ occurs in termHom.
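Conditions P0) to P2) are simple syntactic checks on the defining equations of a base formula. The sketch below shows one way they might be tested; it assumes (purely for illustration) that the equations $u = f(u_{i_1}, \ldots, u_{i_k})$ are stored in a dictionary mapping each defined variable to a pair `(constructor, [children])`, and the function names are hypothetical.

```python
def is_acyclic(defs):
    """P0: the graph with an edge from u to each child of u has no cycle."""
    visiting, done = set(), set()

    def visit(u):
        if u in done:
            return True
        if u in visiting:
            return False  # reached u again while still exploring below it
        visiting.add(u)
        children = defs[u][1] if u in defs else []
        ok = all(visit(v) for v in children)
        visiting.discard(u)
        if ok:
            done.add(u)
        return ok

    return all(visit(u) for u in defs)

def congruence_closed(defs):
    """P1/P2: no two distinct variables are defined by the same f(children)."""
    seen = set()
    for f, children in defs.values():
        key = (f, tuple(children))
        if key in seen:
            return False
        seen.add(key)
    return True

good = {"usw": ("gs", ["usw1", "usw2"])}
cyclic = {"u": ("g", ["v", "v"]), "v": ("g", ["u", "u"])}
duplicate = {"u": ("g", ["a", "b"]), "v": ("g", ["a", "b"])}
print(is_acyclic(good), congruence_closed(good))
print(is_acyclic(cyclic), congruence_closed(duplicate))
```

A conversion procedure such as the one in Proposition 58 would restore a violated congruence property by merging the two variables rather than by rejecting the formula; the check here only detects the violation.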

Note that the validity of the occur check for term variables 
follows from P0) and P3). Another immediate consequence 
of Definition 56 is the following Proposition 57. 

Proposition 57 (Quantification of Str. Base Form.) 

If (3 is a structural base formula and x a free shape or term 
variable in (3, then there exists a base structural formula /3i 
equivalent to 3x./3. 

We proceed to show that a quantifier-free formula can be 
written as a disjunction of base formulas, and a base formula 
can be written as a quantifier-free formula. 

5.3 Conversion to Base Formulas 

Conversion from a quantifier-free formula to the structural
base formula is given by Proposition 58. The proof of
Proposition 58 is analogous to the proof of Proposition 45 but uses
(67) instead of (25).

Proposition 58 (Quantifier-Free to Structural Base) 

Every well-defined quantifier-free formula is equivalent
on $\mathcal{P}$ to true, false, or some disjunction of structural base
formulas.

5.4 Conversion to Quantifier-Free Formulas

The conversion from structural base formulas to quantifier-free
formulas is similar to the case of two constant symbols
in Section 4.3, but requires the use of the Feferman-Vaught
technique.

Definition 59 The set dets of variable determinations of a
structural base formula $\beta$ is the least set $S$ of pairs $(u^*, t^*)$
where $u^*$ is an internal term or shape variable and $t^*$ is a
term over the free variables of $\beta$, such that:

1. if $x^* = u^*$ occurs in termBase or shapeBase, then
$(u^*, x^*) \in S$;

2. if $(u^*, t^*) \in S$ and $u^* = f^*(u^*_1,\ldots,u^*_k)$ occurs in
shapeBase or termBase then
$\{(u^*_1, f^*_1(t^*)),\ldots,(u^*_k, f^*_k(t^*))\} \subseteq S$;

3. if $\{(u^*_1, f^*_1(t^*)),\ldots,(u^*_k, f^*_k(t^*))\} \subseteq S$ and
$u^* = f^*(u^*_1,\ldots,u^*_k)$ occurs in shapeBase or termBase
then $(u^*, t^*) \in S$;

4. if $(u, t) \in S$ and $\mathrm{sh}(u) = u^S$ occurs in termHom then
$(u^S, \mathrm{sh}(t)) \in S$.

Definition 60 An internal variable $u^*$ is determined if
$(u^*, t^*) \in$ dets for some term $t^*$. An internal variable is
undetermined if it is not determined.
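The closure in Definition 59 can be computed by a simple fixpoint iteration. The following is only a minimal sketch under stated assumptions: the encoding of conjuncts as Python tuples (`bindings`, `equations`, `hom`), the term encoding `('var', x)`, `('sel', f, i, t)`, `('sh', t)`, and the function names are all hypothetical and not taken from the paper. It relies on the acyclicity condition P0) for termination.

```python
def determinations(bindings, equations, hom):
    """Least set of (variable, term) pairs closed under rules 1-4.

    bindings:  pairs (x, u) for conjuncts x = u (x free, u internal)
    equations: triples (u, f, args) for conjuncts u = f(args[0], ..., args[k-1])
    hom:       pairs (u, us) for conjuncts sh(u) = us
    Terms are ('var', x), ('sel', f, i, t) for the selector f_i(t), or ('sh', t).
    Assumes the equation graph is acyclic (condition P0); otherwise the
    loop below need not terminate.
    """
    dets = {(u, ('var', x)) for x, u in bindings}             # rule 1
    while True:
        new = set()
        for u, f, args in equations:
            for v, t in dets:
                if v == u:                                     # rule 2
                    for i, a in enumerate(args, 1):
                        new.add((a, ('sel', f, i, t)))
            if args:                                           # rule 3
                for v, t in dets:
                    if v == args[0] and t[:3] == ('sel', f, 1):
                        base = t[3]
                        if all((a, ('sel', f, i, base)) in dets
                               for i, a in enumerate(args, 1)):
                            new.add((u, base))
        for u, us in hom:                                      # rule 4
            for v, t in dets:
                if v == u:
                    new.add((us, ('sh', t)))
        if new <= dets:
            return dets
        dets |= new


def determined(dets):
    """Variables that have at least one determination."""
    return {u for u, _ in dets}
```

In this encoding, a variable that only appears as a parent of determined variables (such as `w` in `w = g(u)` with `x = u`) stays undetermined, which is the situation handled by the source-elimination argument that follows.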

Lemma 61 Let $\beta$ be a structural base formula with matrix
$\beta_0$ and let dets be the determinations of $\beta$. If $(u^*, t^*) \in$ dets
then $\models \beta_0 \Rightarrow u^* = t^*$.

Corollary 62 Let $\beta$ be a structural base formula such that
every internal variable is determined. Then $\beta$ is equivalent
to a well-defined quantifier-free formula.

Proof. By Lemma 61 using

$$\exists x.\ x = t \wedge \phi(x) \iff \phi(t) \tag{68}$$



Lemma 63 Let $u$ be an undetermined non-parameter term
variable in a structural base formula $\beta$ such that $u$ is a source,
i.e. no conjunct of the form

$$u' = f(u_1,\ldots,u,\ldots,u_k)$$

occurs in termBase. Let $\beta'$ be the result of removing $u$ and
conjuncts containing $u$ from $\beta$. Then $\beta$ is equivalent to $\beta'$.

Proof. The conjunct containing u in termHom is a conse- 
quence of the remaining conjuncts, so we drop it. We then 
apply (68). ■ 

Corollary 64 Every base formula is equivalent to a base 
formula without undetermined non-parameter term vari- 
ables. 

Proof. If a structural base formula has an undeter- 
mined non-parameter term variable, then it has an unde- 
termined non-parameter term variable that is a source. Re- 
peatedly apply Lemma 63 to eliminate all undetermined 
non-parameter term variables. ■ 

The following Lemma 65 is a consequence of the fact that
terms of a fixed shape $s$ form a substructure of $\mathcal{P}$ isomorphic
to the finite power $C^m$ where $m = |\mathrm{leaves}(s)|$ and follows
from the Feferman-Vaught theorem in Section 3.3.



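Lemma 65 Let

$$\alpha \equiv \exists u.\ \mathrm{sh}(u) = u^S \wedge \mathrm{sh}(u_1) = u^S \wedge \cdots \wedge \mathrm{sh}(u_k) = u^S \wedge \psi_1 \wedge \cdots \wedge \psi_p \tag{69}$$

where each $\psi_i$ is a cardinality constraint of the form $|\phi|_{u^S} = k$
or $|\phi|_{u^S} \geq k$ where all free variables of $\phi$ are among
$u, u_1,\ldots,u_k$. Then there exists formula $\psi$ such that $\psi$
is a disjunction of conjunctions of cardinality constraints
$|\phi'|_{u^S} = k$ and $|\phi'|_{u^S} \geq k$ where the free variables in each $\phi'$ are
among $u_1,\ldots,u_k$ and formula $\alpha$ is equivalent on $\mathcal{P}$ to $\alpha'$
where

$$\alpha' \equiv \mathrm{sh}(u_1) = u^S \wedge \cdots \wedge \mathrm{sh}(u_k) = u^S \wedge \psi \tag{70}$$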

Proposition 66 (Struct. Base to Quantifier-Free)

Every structural base formula $\beta$ is equivalent on $\mathcal{P}$ to
some well-defined quantifier-free formula $\phi$.

Proof Sketch. By Corollary 64 we may assume that $\beta$ has
no undetermined non-parameter term variables. By Corollary 62
we are done if there are no undetermined variables,
so it suffices to eliminate undetermined parameter term
variables and undetermined shape variables.

Let $u$ be an undetermined parameter term variable. $u$
does not occur in termBase because it cannot have a successor
or a predecessor in the graph associated with term base
formula. Therefore, $u$ occurs only in termHom and cardin.
Let $u^S$ be the shape variable such that $u^S = \mathrm{sh}(u)$ occurs
in termHom. Let $\psi_1,\ldots,\psi_p$ be all conjuncts of cardin that
contain $u$.

Each $\psi_i$ is of the form $|\phi|_{u^S} \geq k_i$ or $|\phi|_{u^S} = k_i$ and for
each variable $u'$ free in $\phi$ the conjunct $\mathrm{sh}(u') = u^S$ occurs
in termHom. The base formula can therefore be written in
form

$$\beta_1 \equiv \exists x_1,\ldots,x_e,x^S_1,\ldots,x^S_f.\ \phi \wedge \alpha$$

where $\alpha$ has the form as in Lemma 65. Applying Lemma 65
we eliminate $u$ and obtain $\alpha' \equiv \bigvee_i \alpha_i$ where each $\alpha_i$ is
a conjunction of cardinality constraints. Base formula $\beta_1$ is
thus equivalent to the disjunction $\bigvee_i \beta_{1,i}$ where each $\beta_{1,i}$
is a base formula

$$\beta_{1,i} \equiv \exists x_1,\ldots,x_e,x^S_1,\ldots,x^S_f.\ \phi \wedge \alpha_i$$

By repeating this process we eliminate all undetermined
parameter term variables from a base formula. Each of the
resulting base formulas contains no undetermined term
variables.

It remains to eliminate undetermined shape variables. 
This process is similar to term algebra quantifier elimi- 
nation; the key ingredient is Lemma 25, which relies on 
the fact that undetermined parameter variables may take 
on infinitely many values. We therefore ensure that un- 
determined parameter shape variables are not constrained 
by term and parameter variables through conjuncts outside 
shapeBase. 

Consider an undetermined parameter shape variable $u^S$.
$u^S$ does not occur in termHom, because all term variables
are determined and a conjunct $u^S = \mathrm{sh}(u)$ would imply that
$u^S$ is determined as well. $u^S$ can thus occur only in cardin
within some cardinality constraint $|\phi|_{u^S} = k$ or $|\phi|_{u^S} \geq k$.
Moreover, formula $\phi$ in each such cardinality constraint is
closed: otherwise $\phi$ would contain some free variable $u$, by
definition of base formula $u$ would have to be a parameter
variable, all parameter term variables are determined,
so $u^S$ would be determined as well. Let $u^S$ denote some
shape $s$. Because $\phi$ is a closed formula, $|\phi|_s$ is equal to $0$
if $[\![\phi]\!]^C = \mathrm{false}$ and to the shape size $m = |\mathrm{leaves}(s)|$ if
$[\![\phi]\!]^C = \mathrm{true}$. (The fact that closed formulas reduce to the
constraints on domain size appears in [35, Theorem 3.36,
Page 13].) After eliminating constraints equivalent to $0 = k$
and $0 \geq k$, we obtain a conjunction of simple linear
constraints of the form $m = k$ and $m \geq k$. These constraints
specify a finite or infinite set $S \subseteq \{0, 1, \ldots\}$ of possible sizes
$m$. Let $A = \{s \mid |\mathrm{leaves}(s)| \in S\}$. If the set $S$ is infinite
then it contains an infinite interval of form $\{m_0, m_0 + 1, \ldots\}$
so the set $A$ is infinite. If $\Sigma$ contains a unary constructor
and $S$ is nonempty, then $A$ is infinite. If $\Sigma$ contains
no unary constructors and $S$ is finite then $A$ is finite and
the cardinality constraints containing $u^S$ are equivalent to
$\bigvee_{i=1}^{p} u^S = t^S_i$ where $A = \{t^S_1,\ldots,t^S_p\}$. We therefore generate
a disjunction of base formulas $\beta_1,\ldots,\beta_p$ where $\beta_i$ results
from $\beta$ by replacing cardinality constraints containing
$u^S$ with $u^S = t^S_i$. We convert each $\beta_i$ to a disjunction
of base formulas by labelling subterms of $t^S_i$ with internal
shape variables and doing case analysis on the equality
between new internal shape variables to ensure the invariants
of a base formula, as in the proof of Proposition 58. By repeating this
process for all shape variables $u^S$ where the set $S$ is finite,
we obtain base formulas where the set $A$ is infinite for every
undetermined parameter shape variable $u^S$. We may then
eliminate all undetermined parameter and non-parameter
shape variables along with the conjuncts that contain them.
The result is an equivalent formula by Lemma 25.

All variables in each of the resulting base formulas are 
determined. By Corollary 62 each formula can be written 
as a quantifier-free formula, and the resulting disjunction is 
a quantifier-free formula. ■ 
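The finiteness argument for the set $A$ of shapes can be made concrete by enumerating all shapes with a given number of leaves. The sketch below is illustrative only: the nested-tuple encoding of shapes, the leaf symbol `'c'`, and the function names are assumptions of this example, not notation from the paper. It requires every constructor to have arity at least two, which is exactly the case in which finitely many sizes give finitely many shapes.

```python
from functools import lru_cache
from itertools import product


def compositions(n, k):
    """All k-tuples of positive integers that sum to n."""
    if k == 1:
        if n >= 1:
            yield (n,)
        return
    for first in range(1, n - k + 2):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def shapes_with_leaves(arities, m):
    """All shapes with exactly m leaves over constructors of arity >= 2.

    arities maps a (hypothetical) constructor name to its arity;
    the string 'c' stands for the single shape constant.
    """
    assert all(k >= 2 for k in arities.values()), "no unary constructors"
    items = tuple(sorted(arities.items()))

    @lru_cache(maxsize=None)
    def go(n):
        result = ['c'] if n == 1 else []
        for f, k in items:
            for split in compositions(n, k):
                # every part of the split is smaller than n because k >= 2
                for kids in product(*(go(j) for j in split)):
                    result.append((f,) + kids)
        return tuple(result)

    return list(go(m)) if m >= 1 else []


def shapes_of_sizes(arities, sizes):
    """The set A = {s | |leaves(s)| in sizes} for a finite set of sizes."""
    return [s for m in sorted(sizes) for s in shapes_with_leaves(arities, m)]
```

With a single binary constructor the counts follow the Catalan numbers, which gives a quick sanity check on the enumeration.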

5.5 One-Relation-Symbol Variance 

So far we have assumed that all constructors are covariant. 
In this section we describe the changes needed to extend 
the result to the case when the constructors have arbitrary 
variance with respect to some distinguished binary relation 
denoted <. 

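Definition 67 If $\phi$ is a first-order formula in the language
$L_C$, the contravariant version of $\phi$, denoted $\phi^{(-1)}$, is defined
by induction on the structure of the formula by:

$$\begin{array}{rcl}
r(t_1,\ldots,t_k)^{(-1)} &=& r(t_1,\ldots,t_k) \\
(t_1 \leq t_2)^{(-1)} &=& t_2 \leq t_1 \\
(\phi_1 \wedge \phi_2)^{(-1)} &=& \phi_1^{(-1)} \wedge \phi_2^{(-1)} \\
(\phi_1 \vee \phi_2)^{(-1)} &=& \phi_1^{(-1)} \vee \phi_2^{(-1)} \\
(\neg\phi)^{(-1)} &=& \neg\phi^{(-1)} \\
(\exists t.\phi)^{(-1)} &=& \exists t.\phi^{(-1)} \\
(\forall t.\phi)^{(-1)} &=& \forall t.\phi^{(-1)}
\end{array} \tag{71}$$

Define $C^{-1}$ to have the same domain and same interpretation
of operations and relations $r \in L_C \setminus \{\leq\}$ but where

$$[\![\leq]\!]^{C^{-1}} = ([\![\leq]\!]^{C})^{-1} \tag{72}$$

We clearly have for every formula $\phi$ and every valuation $\sigma$:

$$[\![\phi^{(-1)}]\!]^{C}\sigma = [\![\phi]\!]^{C^{-1}}\sigma \tag{73}$$

If $l \in \mathrm{leaves}(s)$ is a leaf $l = \langle f^1,i^1\rangle \ldots \langle f^n,i^n\rangle$, define
$\mathrm{variance}(l)$ as the product of integers

$$\mathrm{variance}(l) = \prod_{j=1}^{n} \mathrm{variance}(f^j, i^j) \tag{74}$$

We generalize (65) to

$$[\![\phi(x_1,\ldots,x_k)]\!](t_1,\ldots,t_k) = \{\, l \mid [\![\phi(x_1,\ldots,x_k)]\!]^{C^l}(t_1[l],\ldots,t_k[l]) \,\} \tag{75}$$
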

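where $C^l$ denotes $C$ for $\mathrm{variance}(l) = 1$ and $C^{-1}$ for $\mathrm{variance}(l) = -1$.
Hence, the isomorphism between terms of some fixed shape
$s$ with $|\mathrm{leaves}(s)| = m$ and $C^m$ breaks, but there is still an
isomorphism with $C^{P(s)} \times (C^{-1})^{N(s)}$ where

$$\begin{array}{rcl}
P(s) &=& |\{l \in \mathrm{leaves}(s) \mid \mathrm{variance}(l) = 1\}| \\
N(s) &=& |\{l \in \mathrm{leaves}(s) \mid \mathrm{variance}(l) = -1\}|
\end{array} \tag{76}$$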
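The quantities $P(s)$ and $N(s)$ can be computed by multiplying argument variances along each path from the root to a leaf. The following sketch assumes a nested-tuple encoding of shapes with the string `'c'` as the shape constant, and a dictionary mapping each constructor name to a tuple of argument variances; these encodings and the function names are hypothetical.

```python
def leaf_variances(shape, variance, sign=1, path=()):
    """Yield (leaf, variance(leaf)) for every leaf of the shape.

    A leaf is the sequence of (constructor, argument index) pairs on the
    path from the root; its variance is the product of the variances of
    the argument positions along that path.
    """
    if shape == 'c':
        yield path, sign
        return
    f, *kids = shape
    for i, kid in enumerate(kids, 1):
        yield from leaf_variances(kid, variance,
                                  sign * variance[f][i - 1],
                                  path + ((f, i),))


def P_and_N(shape, variance):
    """Number of covariant leaves P(s) and contravariant leaves N(s)."""
    signs = [v for _, v in leaf_variances(shape, variance)]
    return signs.count(1), signs.count(-1)
```

For a constructor `f` that is contravariant in its first argument, a leaf reached through two first-argument positions is covariant again, since $(-1) \cdot (-1) = 1$.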



Because of this isomorphism, Lemma 65 still holds and we
may still use the Feferman-Vaught theorem from Section 3.3.
Equation (67) generalizes to:



mxi,.. 



■,xk)r^{f{tii,...,tu),...,f{tku. 



■,tkl))\ 
■ ,tki)\ 



(77) 

The only change in the proof of Proposition 58 is the use
of (77) instead of (67). Most of the proof of Proposition 66
remains unchanged as well; the only additional difficulty is
eliminating constraints of the form $|\phi|_{u^S} = k$ and $|\phi|_{u^S} \geq k$
where $u^S$ is a parameter shape variable and $\phi$ is a closed
formula. Lemma 68 below addresses this problem.

We say that an algorithm $g$ finitely computes some function
$f : A \to 2^B$ where $B$ is an infinite set iff $g$ is a function
from $A$ to the set $\mathrm{Fin}(B) \cup \{\infty\}$ where $\mathrm{Fin}(B)$ is the set of
finite subsets of set $B$, $\infty$ is a fresh symbol, and

$$g(a) = \begin{cases} f(a), & \text{if } f(a) \in \mathrm{Fin}(B) \\ \infty, & \text{if } f(a) \notin \mathrm{Fin}(B) \end{cases} \tag{78}$$



Lemma 68 There exists an algorithm that, given a shape
variable $u^S$ and a conjunction $\psi \equiv \bigwedge_i \psi_i$ of cardinality
constraints where each $\psi_i$ is of form $|\phi_i|_{u^S} = k_i$ or $|\phi_i|_{u^S} \geq k_i$
for some closed formula $\phi_i$, finitely computes the set

$$A = \{\, s \mid [\![\psi]\!][u^S \mapsto s] \,\} \tag{79}$$

of shapes which satisfy $\psi$ in $\mathcal{P}$.

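Proof Sketch. Let $\phi$ be a closed formula in language $L_C$.
Compute $[\![\phi]\!]^{C}$ and $[\![\phi]\!]^{C^{-1}}$ and then replace $|\phi|_s$ with one
of the expressions $P(s) + N(s)$, $P(s)$, $N(s)$, $0$ according to
the following table.

$$\begin{array}{c|c|c}
[\![\phi]\!]^{C} & [\![\phi]\!]^{C^{-1}} & |\phi|_s = \\ \hline
\mathrm{true} & \mathrm{true} & P(s) + N(s) \\
\mathrm{true} & \mathrm{false} & P(s) \\
\mathrm{false} & \mathrm{true} & N(s) \\
\mathrm{false} & \mathrm{false} & 0
\end{array} \tag{80}$$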
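Read as a function, the case analysis for a closed formula says that the covariant leaves contribute when $\phi$ holds in $C$ and the contravariant leaves contribute when $\phi$ holds in $C^{-1}$. A minimal sketch of that reading, with a hypothetical function name and argument names, is:

```python
def closed_formula_count(holds_in_C, holds_in_C_inv, P, N):
    """Value of |phi|_s for a closed formula phi.

    holds_in_C / holds_in_C_inv are the truth values of phi in C and in
    C^{-1}; P and N are the numbers of covariant and contravariant
    leaves of the shape s.
    """
    return (P if holds_in_C else 0) + (N if holds_in_C_inv else 0)
```

This makes explicit that a constraint on $|\phi|_s$ with $\phi$ closed is a linear constraint on the pair $(P(s), N(s))$.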

The constraints of the form $N(s) + P(s) = k$ and $N(s) + P(s) \geq k$
can be expressed as propositional combinations of
constraints of the form $N(s) = k$, $P(s) = k$, $P(s) \geq k$ and
$N(s) \geq k$. Therefore, $\psi$ can be written as a propositional
combination of these four kinds of constraints and each
conjunction $C(s)$ can further be assumed to have one of the
forms:

F1) $C_{k_P,k_N}(s) \equiv P(s) = k_P \wedge N(s) = k_N$;

F2) $C_{k_P,k_N^+}(s) \equiv P(s) = k_P \wedge N(s) \geq k_N$;

F3) $C_{k_P^+,k_N}(s) \equiv P(s) \geq k_P \wedge N(s) = k_N$;

F4) $C_{k_P^+,k_N^+}(s) \equiv P(s) \geq k_P \wedge N(s) \geq k_N$.

Let $A = \{s \mid C(s)\}$. To compute $A$ when $\Sigma$ contains
unary constructors, we first restrict $\Sigma$ to the language $\Sigma'$
with no unary constructors, and compute the set $A' \subseteq A$
using language $\Sigma'$. If $A'$ is empty, so is $A$, otherwise $A$ is
infinite. Assume that $\Sigma$ contains no unary constructors. Assume
further that $\Sigma$ contains at least one binary constructor and
at least one constructor is contravariant in some argument.
Let

$$S = \{(P(s), N(s)) \mid s \in A\}$$

Because $P(s) + N(s) = |\mathrm{leaves}(s)|$ and there are only finitely
many shapes of any given size (every constructor is of arity
at least two), it suffices to finitely compute $S$. $S$ can be
given an alternative characterization as follows. If $f \in \Sigma$,
$\mathrm{ar}(f) = k$, $f$ is covariant in $l$ arguments and contravariant
in $k - l$ arguments define

$$[\![f]\!]^{S}((p_1,n_1),\ldots,(p_k,n_k)) = \Big(\sum_{i=1}^{l} p_i + \sum_{i=l+1}^{k} n_i,\ \sum_{i=1}^{l} n_i + \sum_{i=l+1}^{k} p_i\Big) \tag{81}$$

Let $U$ be the subset of $\{(p, n) \mid p, n \geq 0\}$ generated from
element $(1, 0)$ using operations $[\![f]\!]^{S}$ for $f \in \Sigma$. Then

$$S = \{(p, n) \in U \mid c(p, n)\} \tag{82}$$

where $c(p, n)$ is the linear constraint corresponding to the
constraint $C(s)$.

Let $C(s) = C_{k_P,k_N}(s)$. Then $S \subseteq \{(p, n) \mid p + n = k_P + k_N\}$.
$S$ is therefore a subset of a finite set and is easily
computable, which solves case F1).

Let $C(s) = C_{k_P^+,k_N^+}(s)$. Because $\Sigma$ contains a binary
constructor, $S$ contains pairs $(p, n)$ with arbitrarily large $p + n$,
so either the $p$ component or the $n$ component of elements of
$S$ grows unboundedly. Because $\Sigma$ contains a constructor $f$
contravariant in some argument, we can define using $f$ an
operation $\circ$ acting as a constructor covariant in at least one
argument and contravariant in at least one argument. Using
operation $\circ$ on tuples whose one component grows unboundedly
yields tuples whose both components grow unboundedly.
Therefore, $S$ is infinite, which solves case F4).

Finally, consider the case $C(s) = C_{k_P,k_N^+}(s)$ (this will
solve the case $C(s) = C_{k_P^+,k_N}(s)$ as well). Observe that

$$C_{k_P,k_N^+}(s) = C_{k_P,0^+}(s) \wedge \bigwedge_{i=0}^{k_N-1} \neg C_{k_P,i}(s) \tag{83}$$

Because the set $S$ for each $C_{k_P,i}(s)$ is finite, it suffices to
finitely compute $S$ for $C_{k_P,0^+}(s)$. In that case

$$S = \{(p, n) \in U \mid p = k_P\} \tag{84}$$



Let

$$S_i = \{(p, n) \in U \mid p = i\}$$
$$T_i = \{(p, n) \in U \mid n = i\}$$

To finitely compute $S$, finitely compute the sets $S_i$ and $T_i$
for $0 \leq i \leq k_P$. The algorithm starts with all sets $S_i$ and $T_i$
empty and keeps adding elements according to operations
$[\![f]\!]^{S}$.

Assume that $S_0, T_0, \ldots, S_{i-1}, T_{i-1}$ are finitely computed.
The computation of $S_i$ and $T_i$ proceeds as follows. Let $f \in \Sigma$
be a constructor of arity $k$ with $l$ covariant arguments. For
$S_i$ we consider all solutions of the equation

$$p_1 + \cdots + p_l + n_{l+1} + \cdots + n_k = i \tag{86}$$

for nonnegative integers $p_1,\ldots,p_l,n_{l+1},\ldots,n_k$. First
consider solutions where no variable is equal to $i$. If for
one of the solutions, one of the sets $S_{p_1},\ldots,S_{p_l}$ is infinite,
then $S_i$ is infinite, otherwise add to $S_i$ all elements $(i, n)$
where

$$n = n_1 + \cdots + n_l + p_{l+1} + \cdots + p_k \tag{87}$$

If $n \leq k_P$ then also add the same elements $(i, n)$ to $T_n$.
Next, proceed analogously with $T_i$, considering solutions of

$$n_1 + \cdots + n_l + p_{l+1} + \cdots + p_k = i \tag{88}$$

If at this point $S_i$ is not infinite and not empty, then also
consider the solutions of (87) where $p_j = i$ for some $j$. If
such a solution exists, then mark $S_i$ as infinite. Proceed
analogously with $T_i$. Finally, if both $S_i$ and $T_i$ are still finite
but there exists a solution for $S_i$ where $n_{l+j} = i$ for some $j$
and there exists a solution for $T_i$ where $p_{l+d} = i$ for some $d$, then
mark both $S_i$ and $T_i$ as infinite. This completes the sketch
of one step of the computation. (This step also applies to
$S_0$ and $T_0$; we initially assume that $(1, 0) \in T_0$.) ■

Example 69 Let us apply this algorithm to the special case
where $\Sigma = \{f, g\}$ and

variance(g) = (1, 1)

variance(f) = (−1, 1)

Let us see what the set $S$ looks like. If $(x, y) \in S$ define
$k(x, y) = (kx, ky)$ as in a vector space.

First, $(1, 0) \in S$ as the generating element. Next $(1, 1) \in S$ because
of $f^S$ and $(2, 0) \in S$ because of $g^S$.

More generally, we have the following composition rule:
If $(p_1, n_1), (p_2, n_2) \in S$ then

$$(p_1 + p_2,\ n_1 + n_2) \in S$$

because of $g^S$, and

$$(n_1 + p_2,\ p_1 + n_2) \in S$$

because of $f^S$.

Using $g^S$ we obtain all pairs $(p, 0)$ for $p \geq 1$. Using $f^S$
once on those we obtain $(1, n)$ for $n > 0$. Adding these we
additionally obtain $(p, n)$ for $p \geq 2$ and $n > 0$. Hence we
have all pairs $(p, n)$ for $p \geq 1$ and $n \geq 0$ and those are the
only ones that can be obtained. Thus,

$$S = \{(p, n) \mid p \geq 1 \wedge n \geq 0\}$$

As expected, the case F1) yields a finite and the case F4)
an infinite set. The case F2) for $k_P = 0$ is an empty set,
otherwise it is an infinite set. The case F3) always yields
an infinite set. This solves the problem for two constructors
$f, g$.

lifted relations $r'$ for $r \in L_C$
    $r' :: \mathrm{term}^k \to \mathrm{bool}$

term algebra on terms
    constructors, $f \in \Sigma$:
        $f :: \mathrm{term}^k \to \mathrm{term}$
    constructor test, $f \in \Sigma$:
        $\mathrm{Is}_f :: \mathrm{term} \to \mathrm{bool}$
    selectors, $f \in \Sigma$:
        $f_i :: \mathrm{term} \to \mathrm{term}$

Figure 8: Basic Operations of $\Sigma$-term-power Structure

♦ 
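The set of pairs described in Example 69 can be checked mechanically up to a size bound. The sketch below generates every pair reachable from $(1, 0)$ whose components sum to at most `bound`, using one operation per constructor as in (81): covariant arguments contribute $(p_i, n_i)$ and contravariant arguments contribute the swapped pair. The function name, the dictionary encoding of variances, and the bounded search are assumptions of this example; the bound is sound only because all constructors are assumed to have arity at least two, so every argument of a generated pair has a strictly smaller total.

```python
from itertools import product


def generate_pairs(variances, bound):
    """All pairs (p, n) generated from (1, 0) with p + n <= bound.

    variances maps a (hypothetical) constructor name to a tuple of
    +1 / -1 argument variances; every constructor must have arity >= 2.
    """
    assert all(len(v) >= 2 for v in variances.values())
    pairs = {(1, 0)}
    changed = True
    while changed:
        changed = False
        for var in variances.values():
            # sorted(...) takes a snapshot, so adding to `pairs` is safe
            for args in product(sorted(pairs), repeat=len(var)):
                p = sum(a[0] if v == 1 else a[1] for a, v in zip(args, var))
                n = sum(a[1] if v == 1 else a[0] for a, v in zip(args, var))
                if p + n <= bound and (p, n) not in pairs:
                    pairs.add((p, n))
                    changed = True
    return pairs
```

For the two constructors of the example, the generated pairs up to a bound coincide with $\{(p, n) \mid p \geq 1 \wedge n \geq 0\}$ restricted to that bound, and filtering by a fixed $p$ or a fixed $n$ gives the finite slices used in cases F1) to F4).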

Lemma 68 allows us to carry out the proof of Proposition 66
so we obtain our main result for finite $C$.

Theorem 70 (Term Power Quant. Elimination)

There exists an algorithm that, for a given well-defined
formula $\phi$, produces a quantifier-free formula $\phi'$ that is
equivalent to $\phi$ on $\mathcal{P}$.

Corollary 71 (Decidability of Structural Subtyping)

Let $C$ be a structure with a finite carrier and $\mathcal{P}$ a $\Sigma$-term-power
of $C$. Then the first-order theory of $\mathcal{P}$ is decidable.

6 Term-Powers of Decidable Theories 

In this section we extend the result of Section 5 on decidabil- 
ity of term-powers of a base structure C to allow C to be an 
arbitrary decidable theory, even if the carrier C is infinite. 

To keep a finite language in the case when $C$ is infinite,
we introduce a predicate $\mathrm{Is}_{\mathrm{PRI}}$ that allows testing whether
$t \in C$ for a term $t \in P$.

In structural base formulas, we now distinguish between
1) composed variables, denoting elements $t \in P$ for which
$\mathrm{Is}_f(t)$ holds for some constructor $f \in \Sigma$, and 2) primitive
variables, denoting elements $t \in P$ for which $\mathrm{Is}_{\mathrm{PRI}}(t)$ holds.

Another generalization compared to Section 5 is the use 
of a syntactically richer language for term power algebras; to 
some extent this richer language can be viewed as syntactic 
sugar and can be simplified away. 

The generalization to infinitely many primitive types and 
the generalization to a richer language are orthogonal. 

For most of the section we focus on covariant construc- 
tors. Section 6.5 discusses a generalized notion of variance. 

As in Section 3.3 let $\mathcal{C} = \langle C, R \rangle$ be a decidable structure
where $C$ is a non-empty set and $R$ is a set of relations
interpreting some relational language $L_C$, such that each $r \in R$
is a relation of arity $\mathrm{ar}(r)$ on set $C$, i.e. $r \subseteq C^{\mathrm{ar}(r)}$. We
assume that $R$ contains a binary relation symbol $r_{=} \in R$,
interpreted as equality on the set $C$.

Operations and relations of the $\Sigma$-term-power structure
are summarized in Figure 8. We will show the decidability of
the first-order theory of the structure with these operations.

In the special case when $C = \{a, b\}$ and

$$r = \{(a, a), (a, b), (b, b)\}$$

we obtain the theory in Section 4. When $R = \{r\}$ where $r$ is
a partial order on types, we obtain the theory of structural
subtyping of non-recursive covariant types. For arbitrary
relational structure $\mathcal{C}$, if $f \in \Sigma$ for $\mathrm{ar}(f) = k$ we obtain a
structure that properly contains the $k$-th strong power of
structure $\mathcal{C}$, in the terminology of [35].

The structure of this section follows Section 4. We also
associate a boolean algebra of sets with each term $t$. However,
in this case, the elements of the associated boolean
algebra are sets of occurrences of the constants that satisfy
the given first-order formula interpreted over $C$. The
occurrences of constants within the terms of a given shape
correspond to the indices of the product structure in
Section 3.3. We call these occurrences leaves, because they can
be represented as leaves of the tree corresponding to a term.

6.1 Product Theory of Terms of a Given Shape 

In this section we define the notions shape and leafset, and 
state some properties that we use in the sequel. 
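Let

$$\Sigma_0 = \{c^S\} \cup \{f^S \mid f \in \Sigma\}$$

be a set of function symbols such that $c^S$ is a fresh constant
symbol with $\mathrm{ar}(c^S) = 0$ and $f^S$ are fresh distinct function
symbols with $\mathrm{ar}(f^S) = \mathrm{ar}(f)$ for each $f \in \Sigma$. Let $\mathrm{shapified} :
\Sigma' \to \Sigma_0$ be defined by

$$\begin{array}{rcl}
\mathrm{shapified}(x) &=& c^S, \text{ if } x \in C \\
\mathrm{shapified}(f) &=& f^S, \text{ if } f \in \Sigma
\end{array}$$

Let $\mathrm{FT}(\Sigma_0)$ be the set of ground terms with signature $\Sigma_0$
and $\mathrm{FT}(\Sigma')$ the set of ground terms of signature $\Sigma'$.

Define function $\mathrm{sh} :: \mathrm{FT}(\Sigma') \to \mathrm{FT}(\Sigma_0)$ mapping each
term to its shape by

$$\mathrm{sh}(f(t_1,\ldots,t_n)) = \mathrm{shapified}(f)(\mathrm{sh}(t_1),\ldots,\mathrm{sh}(t_n))$$

for each $f \in \Sigma'$. Define $t_1 \sim t_2$ iff $\mathrm{sh}(t_1) = \mathrm{sh}(t_2)$.

Let $t$ be a term or shape and $t'$ the tree representing $t$ as
in Section 2.2. If $p$ is a path such that $t'(p)$ is defined and
denotes a constant, we write $t[p]$ to denote $t'(p)$ and call $p$ a
leaf. Note that $t[p]$ is defined iff $\mathrm{sh}(t)[p]$ is defined. On the
set of equivalent terms leaves act as indices of Section 3.3. If
$s$ is a shape, let $\mathrm{leaves}(s)$ denote the set of all leaves defined
on shape $s$.

Generalizing tCont of Section 4.1, define function $\mathrm{tCont} :
\mathrm{FT}(\Sigma') \to C^*$ by:

$$\begin{array}{rcl}
\mathrm{tCont}(c) &=& c, \text{ if } c \in C \\
\mathrm{tCont}(f(t_1,\ldots,t_k)) &=& \mathrm{tCont}(t_1) \cdot \ldots \cdot \mathrm{tCont}(t_k)
\end{array}$$

Define $\delta(t) = \langle \mathrm{sh}(t), \mathrm{tCont}(t) \rangle$ and

$$B = \{\langle s, w \rangle \mid s \in \mathrm{FT}(\Sigma_0),\ w \in C^*,\ \mathrm{tLen}(s) = \mathrm{sLen}(w)\}$$

If all constructors $f \in \Sigma$ are covariant then $\delta$ is a bijection
between $\mathrm{FT}(\Sigma')$ and $B$. Let

$$B(s_0) = \{\langle s, w \rangle \in B \mid s = s_0\}$$

For a fixed $s_0$, the set $B(s_0)$ is isomorphic to the power
structure $C^n$ where $n = \mathrm{tLen}(s_0)$.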
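The decomposition of a term into its shape and its sequence of constants, and the fact that the two together determine the term, can be illustrated as follows. This is a sketch under assumptions: terms are encoded as nested tuples `(f, t1, ..., tk)` with primitive constants as strings, the shape constant is written `'c^S'`, shape constructors are named by appending `'^S'`, and the function names `sh`, `t_cont`, `delta`, and `delta_inv` are chosen for this example only.

```python
def sh(t):
    """Shape of a term: constants become 'c^S', f becomes f^S."""
    if isinstance(t, str):
        return 'c^S'
    f, *args = t
    return (f + '^S',) + tuple(sh(a) for a in args)


def t_cont(t):
    """Left-to-right sequence of the constants occurring in a term."""
    if isinstance(t, str):
        return (t,)
    return tuple(c for a in t[1:] for c in t_cont(a))


def delta(t):
    """The pair (shape, constant sequence) associated with a term."""
    return sh(t), t_cont(t)


def delta_inv(s, w):
    """Rebuild the term from a shape and a matching constant sequence."""
    constants = iter(w)

    def build(shape):
        if shape == 'c^S':
            return next(constants)
        f_s, *kids = shape
        return (f_s[:-2],) + tuple(build(k) for k in kids)

    t = build(s)
    missing = object()
    if next(constants, missing) is not missing:
        raise ValueError("more constants than leaves in the shape")
    return t
```

Applying `delta_inv` to the result of `delta` returns the original term, which is the bijection property used to view terms of one shape as tuples over the base structure.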

For each shape $s$ we introduce operations from Section 3.3.
To distinguish the sets of positions belonging to
different shapes, we tag each set of positions $L$ with a shape
$s$. We call the pair $\langle s, L \rangle$ a leafset. The interpretation of
each relation $r \in L_C$ is the leafset:

$$[\![r_s]\!](t_1,\ldots,t_k) = \langle s, \{p \mid [\![r]\!]^{C}(t_1[p],\ldots,t_k[p])\} \rangle$$

We let $\wedge^L_s$, $\vee^L_s$, $\neg^L_s$, $\mathrm{true}^L_s$, $\mathrm{false}^L_s$ stand for intersection, union,
complement, full set and empty set in the algebra of subsets
of the set $\mathrm{leaves}(s)$. We also introduce $\exists^L_s$ as the union of a
family of subsets indexed by a term of shape $s$ and $\forall^L_s$ as the
intersection of a family of subsets indexed by a term.

We use constructor-selector language for the term algebra
on terms. We introduce constructor-selector language
on shapes by generalizing operations in Section 4.1 in a natural
way. In addition, we introduce a constructor-selector
language on leafsets. For each $f \in \Sigma$ we introduce a
constructor symbol $f^L$ on leafsets and define

$$\mathrm{leafified}(f) = f^L$$

Constructors $f^L$ act on leafsets as follows. If $L_i \subseteq \mathrm{leaves}(s_i)$
for $1 \leq i \leq k$ define

$$f^L(\langle s_1, L_1 \rangle,\ldots,\langle s_k, L_k \rangle) = \langle s, L \rangle$$

where $s = f^S(s_1,\ldots,s_k)$, and $L \subseteq \mathrm{leaves}(s)$ is given by

$$L = (\{1\} \cdot L_1) \cup \cdots \cup (\{k\} \cdot L_k)$$

(Here we define $A \cdot B = \{a \cdot b \mid a \in A \wedge b \in B\}$.)

We define selector functions on leafsets as follows. If $s =
f^S(s_1,\ldots,s_k)$ and $L \subseteq \mathrm{leaves}(s)$, then $f^L_i(\langle s, L \rangle) = \langle s_i, L_i \rangle$
where $L_i \subseteq \mathrm{leaves}(s_i)$ is defined by

$$L_i = \{w \mid i \cdot w \in L\}$$

Equivalently, we require that

$$f^L_i(f^L(\langle s_1, L_1 \rangle,\ldots,\langle s_k, L_k \rangle)) = \langle s_i, L_i \rangle$$
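The constructor and selector operations on leafsets can be sketched directly on a set-of-paths representation. The encoding below is an assumption of this example: a leafset is a pair of a shape (nested tuples with `'c^S'` as the shape constant) and a `frozenset` of leaves, where a leaf is written root-first as a tuple of argument indices. The names `leaves`, `construct`, and `select` are hypothetical.

```python
def leaves(s):
    """All leaves of a shape, as root-first tuples of argument indices."""
    if s == 'c^S':
        return frozenset({()})
    return frozenset((i,) + p
                     for i, kid in enumerate(s[1:], 1)
                     for p in leaves(kid))


def construct(f, *leafsets):
    """Leafset constructor: prefix the leaves of argument i with i."""
    s = (f + '^S',) + tuple(si for si, _ in leafsets)
    L = frozenset((i,) + p
                  for i, (_, Li) in enumerate(leafsets, 1)
                  for p in Li)
    return s, L


def select(i, leafset):
    """Leafset selector: keep leaves under argument i and drop the prefix."""
    s, L = leafset
    if isinstance(s, str):
        raise ValueError("selector applied to a primitive leafset")
    return s[i], frozenset(p[1:] for p in L if p[0] == i)
```

In this representation the selector undoes the constructor on each argument, mirroring the requirement stated above.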

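We can now express relations $r'$ in Figure 8 using the fact:

$$r'(t_1,\ldots,t_k) \iff \mathrm{sh}(t_2) = \mathrm{sh}(t_1) \wedge \cdots \wedge \mathrm{sh}(t_k) = \mathrm{sh}(t_1) \wedge r_{\mathrm{sh}(t_1)}(t_1,\ldots,t_k) = \mathrm{true}^L_{\mathrm{sh}(t_1)} \tag{89}$$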

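To handle an infinite number of elements of the base
structure $C$, we do not introduce into the language constants
for every element of $C$ as in Section 5. Instead, we introduce
the predicate $\mathrm{Is}_{\mathrm{PRI}} :: \mathrm{term} \to \mathrm{bool}$ called primitive-term test
that checks whether a term is a constant:

$$\mathrm{Is}_{\mathrm{PRI}}(x) = (x \in C)$$

and the predicate $\mathrm{Is}_{\mathrm{PRI}^L} :: \mathrm{leafset} \to \mathrm{bool}$ called primitive-leafset
test:

$$\mathrm{Is}_{\mathrm{PRI}^L}(\langle s, L \rangle) = (s = c^S)$$

Instead of the rule (16), we have for $f, g \in \Sigma \cup \{\mathrm{PRI}\}$:

$$\begin{array}{l}
\forall x.\ \bigvee_{f \in \Sigma \cup \{\mathrm{PRI}\}} \mathrm{Is}_f(x) \\
\forall x.\ \neg(\mathrm{Is}_f(x) \wedge \mathrm{Is}_g(x)), \text{ for } f \not\equiv g
\end{array} \tag{90}$$

Analogous rules hold for term algebra of leafsets:

$$\begin{array}{l}
\forall x.\ \bigvee_{f \in \Sigma \cup \{\mathrm{PRI}\}} \mathrm{Is}_{f^L}(x) \\
\forall x.\ \neg(\mathrm{Is}_{f^L}(x) \wedge \mathrm{Is}_{g^L}(x)), \text{ for } f \not\equiv g
\end{array} \tag{91}$$

Term algebra of shapes satisfies the original rules (16) of
term algebra.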



6.2 A Logic for Term- Power Algebras 

To show the decidability of the first-order theory of the 
structure FT* with operations in Figure 8, we show decid- 
ability for a richer structure. Figure 9 shows the operations 
and relations of this richer structure. 

The structure has four sorts: bool representing truth values,
term representing terms, shape representing shapes, and
leafset representing sets of leaves within a given shape. The
structure can be seen as a combination of the operations
of Figure 5 and Figure 2.

For each relation symbol $r \in R$ we define a relation symbol
$r^*$ of sort $\mathrm{shape} \times \mathrm{term}^k \to \mathrm{bool}$ acting on terms of the
same shape. While in Section 4.2 we associate a boolean
algebra with the terms of same shape, in this section we
associate a cylindric algebra [21] with terms of the same
shape. This is a particularly simple cylindric algebra
resulting from lifting first-order logic on the base structure
$C$ so that elements are replaced by terms of a given shape
(which are isomorphic to functions from leaves to elements),
and boolean values are replaced by sets of leaves (isomorphic
to functions from leaves to booleans). In both cases,
operations on the set $X$ are lifted to operations on the set
$\mathrm{leaves}(s) \to X$. Syntactically, we introduce a copy of all
propositional connectives and quantifiers: $\wedge^L_{\_}$, $\vee^L_{\_}$, $\mathrm{true}^L_{\_}$,
$\mathrm{false}^L_{\_}$. Like boolean algebra operations in Figure 5, these
syntactic constructs in Figure 9 take an additional shape
argument, because term-power algebra contains one copy of
a strong power $C^n$ of base structure for each shape. We call
formulas built using the operations of the cylindric algebra
inner formulas.

For each operation in Figure 2 there is an operation in 
Figure 9, potentially taking a shape as an additional argu- 
ment (for operations used to build inner formulas) . The logic 
further contains term algebra operations on terms, leafsets, 
and shapes. 

We use undecorated identifiers (e.g. $u$) to denote variables
of term sort, variables with superscript $S$ to denote
shape variables (e.g. $u^S$) and variables with superscript $L$ to
denote leafset variables (e.g. $u^L$).

Figures 10 and 11 show the semantics of logic in Figure 9.
The first column specifies semantics of operations in the
case when all arguments are defined and are in the domain
of the operation. The domain of each operation is in the
second column; it is omitted if it is equal to the entire domain
resulting from interpreting the sort of the operation.
All operations except for plain logical operations and quantifiers
over the bool domain are strict. Logical operations
and quantifiers over the bool domain are defined as in the
three-valued logic of Section 2.3.

We remark that values of leafset act as terms with two 
constants in Figure 5. In fact, if the base structure C has 
only two constants then the formula x = a and its proposi- 
tional combinations are sufficient to express all facts about 
C, so in that case there is no need to distinguish between 
terms and leafsets. 

6.3 Some Properties of Term-Power Structure 

In this section we establish some further properties of the 
term-power structure, including the homomorphism proper- 
ties between the term algebra of terms and the term algebra 
of leafsets. We also argue that it suffices to consider a re- 
stricted class of formulas called simple formulas. 
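
per-shape product structure
    inner formula relations for $r \in L_C$:
        $r_{\_} :: \mathrm{shape} \times \mathrm{term}^k \to \mathrm{leafset}$
    inner logical connectives:
        $\wedge^L_{\_}, \vee^L_{\_} :: \mathrm{shape} \times \mathrm{leafset} \times \mathrm{leafset} \to \mathrm{leafset}$
        $\neg^L_{\_} :: \mathrm{shape} \times \mathrm{leafset} \to \mathrm{leafset}$
        $\mathrm{true}^L_{\_}, \mathrm{false}^L_{\_} :: \mathrm{shape} \to \mathrm{leafset}$
    inner formula quantifiers:
        $\exists^L_{\_}, \forall^L_{\_} :: \mathrm{shape} \times (\mathrm{term} \to \mathrm{leafset}) \to \mathrm{leafset}$
    leafset equality:
        $=^L :: \mathrm{leafset} \times \mathrm{leafset} \to \mathrm{bool}$
    leafset cardinality constraints, $k \geq 0$:
        $|\_|_{\_} \geq k,\ |\_|_{\_} = k :: \mathrm{shape} \times \mathrm{leafset} \to \mathrm{bool}$
    leafset quantifiers:
        $\exists^L, \forall^L :: (\mathrm{leafset} \to \mathrm{bool}) \to \mathrm{bool}$
    term equality:
        $= :: \mathrm{term} \times \mathrm{term} \to \mathrm{bool}$
    term quantifiers:
        $\exists, \forall :: (\mathrm{term} \to \mathrm{bool}) \to \mathrm{bool}$
    shape equality:
        $=^S :: \mathrm{shape} \times \mathrm{shape} \to \mathrm{bool}$
    shape quantifiers:
        $\exists^S, \forall^S :: (\mathrm{shape} \to \mathrm{bool}) \to \mathrm{bool}$
    logical connectives:
        $\wedge, \vee :: \mathrm{bool} \times \mathrm{bool} \to \mathrm{bool}$
        $\neg :: \mathrm{bool} \to \mathrm{bool}$
        $\mathrm{true}, \mathrm{false}, \mathrm{undef} :: \mathrm{bool}$

term algebra on terms
    constructors, $f \in \Sigma$:
        $f :: \mathrm{term}^k \to \mathrm{term}$
    constructor test, $f \in \Sigma$:
        $\mathrm{Is}_f :: \mathrm{term} \to \mathrm{bool}$
    primitive-term test:
        $\mathrm{Is}_{\mathrm{PRI}} :: \mathrm{term} \to \mathrm{bool}$
    selectors, $f \in \Sigma$:
        $f_i :: \mathrm{term} \to \mathrm{term}$
    term shape:
        $\mathrm{sh} :: \mathrm{term} \to \mathrm{shape}$

term algebra on leafsets
    constructors, $f \in \Sigma$:
        $f^L :: \mathrm{leafset}^k \to \mathrm{leafset}$
    constructor test, $f \in \Sigma$:
        $\mathrm{Is}_{f^L} :: \mathrm{leafset} \to \mathrm{bool}$
    primitive-leafset test:
        $\mathrm{Is}_{\mathrm{PRI}^L} :: \mathrm{leafset} \to \mathrm{bool}$
    selectors, $f \in \Sigma$:
        $f^L_i :: \mathrm{leafset} \to \mathrm{leafset}$
    leafset shape:
        $\mathrm{lssh} :: \mathrm{leafset} \to \mathrm{shape}$

term algebra on shapes
    constructors, $f \in \Sigma_0$:
        $f^S :: \mathrm{shape}^k \to \mathrm{shape}$
    constructor test, $f \in \Sigma_0$:
        $\mathrm{Is}_{f^S} :: \mathrm{shape} \to \mathrm{bool}$
    selectors, $f \in \Sigma$:
        $f^S_i :: \mathrm{shape} \to \mathrm{shape}$

Figure 9: Operations and relations in structure $\mathcal{P}$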
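
interpretation of sorts
    $[\![\mathrm{term}]\!] = \mathrm{FT}(\Sigma')$
    $[\![\mathrm{shape}]\!] = \mathrm{FT}(\Sigma_0)$
    $[\![\mathrm{leafset}]\!] = \{\langle s, L \rangle \mid L \subseteq \mathrm{leaves}(s)\}$
    $[\![\mathrm{bool}]\!] = \{\mathrm{true}, \mathrm{false}, \mathrm{undef}\}$

semantics | well-definedness

inner formula relations for $r \in L_C$:
    $[\![r_{\_}]\!](s, t_1,\ldots,t_k) = \langle s, \{l \mid [\![r]\!]^{C}(t_1[l],\ldots,t_k[l])\} \rangle$ | $\mathrm{sh}(t_1) = s \wedge \ldots \wedge \mathrm{sh}(t_k) = s$
inner logical connectives:
    $[\![\wedge^L]\!](s, \langle s_1, L_1 \rangle, \langle s_2, L_2 \rangle) = \langle s, L_1 \cap L_2 \rangle$ | $s_1 = s \wedge s_2 = s$
    $[\![\vee^L]\!](s, \langle s_1, L_1 \rangle, \langle s_2, L_2 \rangle) = \langle s, L_1 \cup L_2 \rangle$ | $s_1 = s \wedge s_2 = s$
    $[\![\neg^L]\!](s, \langle s_1, L_1 \rangle) = \langle s, \mathrm{leaves}(s) \setminus L_1 \rangle$ | $s_1 = s$
    $[\![\mathrm{true}^L]\!](s) = \langle s, \mathrm{leaves}(s) \rangle$
    $[\![\mathrm{false}^L]\!](s) = \langle s, \emptyset \rangle$
inner formula quantifiers, for $h : [\![\mathrm{term}]\!] \to [\![\mathrm{leafset}]\!]$:
    $[\![\exists^L]\!](s, h) = \langle s, \bigcup \{L \mid \exists t \in [\![\mathrm{term}]\!].\ \mathrm{sh}(t) = s \wedge h(t) = \langle s, L \rangle\} \rangle$ | $\forall t \in [\![\mathrm{term}]\!].\ \mathrm{lssh}(h(t)) = s$
    $[\![\forall^L]\!](s, h) = \langle s, \bigcap \{L \mid \exists t \in [\![\mathrm{term}]\!].\ \mathrm{sh}(t) = s \wedge h(t) = \langle s, L \rangle\} \rangle$ | $\forall t \in [\![\mathrm{term}]\!].\ \mathrm{lssh}(h(t)) = s$
leafset equality:
    $[\![=^L]\!](\langle s_1, L_1 \rangle, \langle s_2, L_2 \rangle) = (s_1 = s_2 \wedge L_1 = L_2)$
leafset cardinality constraints:
    $[\![\,|\langle s_1, L_1 \rangle|_s \geq k\,]\!] = (|L_1| \geq k)$ | $s_1 = s$
    $[\![\,|\langle s_1, L_1 \rangle|_s = k\,]\!] = (|L_1| = k)$ | $s_1 = s$
leafset quantifiers, for $h : [\![\mathrm{leafset}]\!] \to [\![\mathrm{bool}]\!]$:
    $[\![\exists^L]\!]h = \exists \langle s, L \rangle \in [\![\mathrm{leafset}]\!].\ h(\langle s, L \rangle)$
    $[\![\forall^L]\!]h = \forall \langle s, L \rangle \in [\![\mathrm{leafset}]\!].\ h(\langle s, L \rangle)$
term equality:
    $[\![=]\!](t_1, t_2) = (t_1 = t_2)$
term quantifiers, for $h : [\![\mathrm{term}]\!] \to [\![\mathrm{bool}]\!]$:
    $[\![\exists]\!]h = \exists t \in [\![\mathrm{term}]\!].\ h(t)$
    $[\![\forall]\!]h = \forall t \in [\![\mathrm{term}]\!].\ h(t)$
shape equality:
    $[\![=^S]\!](t^S_1, t^S_2) = (t^S_1 = t^S_2)$
shape quantifiers, for $h : [\![\mathrm{shape}]\!] \to [\![\mathrm{bool}]\!]$:
    $[\![\exists^S]\!]h = \exists t \in [\![\mathrm{shape}]\!].\ h(t)$
    $[\![\forall^S]\!]h = \forall t \in [\![\mathrm{shape}]\!].\ h(t)$

Figure 10: Semantics for Logic of Term-Power Algebra (Part I)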
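The semantics of inner relations, inner connectives, and cardinality constraints can be read as ordinary set operations on leaves. The sketch below is illustrative only and makes several assumptions: terms are nested tuples with string constants, shapes use `'c^S'` and a `'^S'` suffix, leaves are root-first tuples of argument indices, the base relation is given as a finite set of tuples (here the two-element order from the special case of Section 4), and all function names are hypothetical. Ill-defined applications raise an exception rather than returning an `undef` value.

```python
LEQ = {('a', 'a'), ('a', 'b'), ('b', 'b')}  # example base relation on {a, b}


def sh(t):
    if isinstance(t, str):
        return 'c^S'
    return (t[0] + '^S',) + tuple(sh(a) for a in t[1:])


def leaves(s):
    if s == 'c^S':
        return frozenset({()})
    return frozenset((i,) + p for i, kid in enumerate(s[1:], 1) for p in leaves(kid))


def at(t, leaf):
    """The constant t[leaf]; index 0 of a tuple holds the constructor name."""
    for i in leaf:
        t = t[i]
    return t


def rel(r, s, *ts):
    """Inner relation: the leaves at which r holds componentwise."""
    if any(sh(t) != s for t in ts):
        raise ValueError("undefined: argument of a different shape")
    return s, frozenset(l for l in leaves(s) if tuple(at(t, l) for t in ts) in r)


def conj(s, x, y):
    if x[0] != s or y[0] != s:
        raise ValueError("undefined: leafset of a different shape")
    return s, x[1] & y[1]


def neg(s, x):
    if x[0] != s:
        raise ValueError("undefined: leafset of a different shape")
    return s, leaves(s) - x[1]


def true_(s):
    return s, leaves(s)


def card(s, x):
    if x[0] != s:
        raise ValueError("undefined: leafset of a different shape")
    return len(x[1])


def lifted(r, *ts):
    """Lifted relation r': same shape and r holds at every leaf."""
    s = sh(ts[0])
    return all(sh(t) == s for t in ts) and rel(r, s, *ts) == true_(s)
```

In this reading, a lifted relation holds exactly when the corresponding inner formula evaluates to the full leafset of the common shape, and a cardinality constraint simply counts the leaves in the resulting set.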
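
semantics | well-definedness

term algebra on terms
    constructors, $f \in \Sigma$:
        $[\![f]\!](t_1,\ldots,t_k) = f(t_1,\ldots,t_k)$
    constructor test, $f \in \Sigma$:
        $[\![\mathrm{Is}_f]\!](t) = \exists t_1,\ldots,t_k.\ t = f(t_1,\ldots,t_k)$
    primitive-term test:
        $[\![\mathrm{Is}_{\mathrm{PRI}}]\!](t) = (t \in C)$
    selectors, $f \in \Sigma$:
        $[\![f_i]\!](t) = \varepsilon t_i.\ t = f(t_1,\ldots,t_i,\ldots,t_k)$ | $[\![\mathrm{Is}_f]\!](t)$
    term shape:
        $[\![\mathrm{sh}]\!](f(t_1,\ldots,t_n)) = \mathrm{shapified}(f)(\mathrm{sh}(t_1),\ldots,\mathrm{sh}(t_n))$

term algebra on leafsets
    constructors, $f \in \Sigma$:
        $[\![f^L]\!](\langle s_1, L_1 \rangle,\ldots,\langle s_k, L_k \rangle) = \langle f^S(s_1,\ldots,s_k),\ (\{1\} \cdot L_1) \cup \cdots \cup (\{k\} \cdot L_k) \rangle$
    constructor test, $f \in \Sigma$:
        $[\![\mathrm{Is}_{f^L}]\!](\langle s, L \rangle) = \exists s_1, L_1,\ldots,s_k, L_k.\ \langle s, L \rangle = [\![f^L]\!](\langle s_1, L_1 \rangle,\ldots,\langle s_k, L_k \rangle)$
    primitive-leafset test:
        $[\![\mathrm{Is}_{\mathrm{PRI}^L}]\!](\langle s, L \rangle) = (s = c^S)$
    selectors, $f \in \Sigma$:
        $[\![f^L_i]\!](\langle s, L \rangle) = \varepsilon \langle s_i, L_i \rangle.\ \langle s, L \rangle = [\![f^L]\!](\langle s_1, L_1 \rangle,\ldots,\langle s_i, L_i \rangle,\ldots,\langle s_k, L_k \rangle)$ | $[\![\mathrm{Is}_{f^L}]\!](\langle s, L \rangle)$
    leafset shape:
        $[\![\mathrm{lssh}]\!](\langle s, L \rangle) = s$

term algebra on shapes
    constructors, $f \in \Sigma$:
        $[\![f^S]\!](s_1,\ldots,s_k) = f^S(s_1,\ldots,s_k)$
    constructor test, $f \in \Sigma_0$:
        $[\![\mathrm{Is}_{f^S}]\!](s) = \exists s_1,\ldots,s_k.\ s = f^S(s_1,\ldots,s_k)$
    selectors, $f \in \Sigma$:
        $[\![f^S_i]\!](s) = \varepsilon s_i.\ s = f^S(s_1,\ldots,s_i,\ldots,s_k)$ | $[\![\mathrm{Is}_{f^S}]\!](s)$

Figure 11: Semantics for Logic of Term-Power Algebra (Part II)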
Recall that $r^{=} \in R$ is the equality relation on $C$. Given
$r^{=}$, we can express the equality between terms by:

$$t_1 = t_2 \iff \mathrm{sh}(t_2) =^S \mathrm{sh}(t_1) \,\wedge\, r^{=}_{\mathrm{sh}(t_1)}(t_1,t_2) =^L \mathrm{true}^L_{\mathrm{sh}(t_1)} \qquad (92)$$

We define the notion of a $u^S$-term as in Definition 40
except that we use different symbols for boolean algebra
operations.

Definition 72 ($u^S$-terms) Let $u^S \in \mathrm{Var}^S$ be a shape variable.
The set of $u^S$-terms $\mathrm{Term}(u^S)$ is the least set such that:

1. $u^L \in \mathrm{Term}(u^S)$ for every leafset variable $u^L$;

2. $\mathrm{false}^L_{u^S}, \mathrm{true}^L_{u^S} \in \mathrm{Term}(u^S)$;

3. if $t_1^L, t_2^L \in \mathrm{Term}(u^S)$, then also

   $t_1^L \wedge^L_{u^S} t_2^L \in \mathrm{Term}(u^S)$,

   $t_1^L \vee^L_{u^S} t_2^L \in \mathrm{Term}(u^S)$, and

   $\neg^L_{u^S} t_1^L \in \mathrm{Term}(u^S)$

If $t^S$ is a term of shape sort, the notion of $t^S$-inner formula
is defined as follows.

Definition 73 ($u^S$-inner formula) Let $u^S \in \mathrm{Var}^S$ be a
shape variable. The set of $u^S$-inner formulas $\mathrm{Inner}(u^S)$ is
the least set such that:

1. if $u_1,\ldots,u_k$ are term variables and $r \in L_C$ such that
   $\mathrm{ar}(r) = k$, then

   $r_{u^S}(u_1,\ldots,u_k) \in \mathrm{Inner}(u^S)$

2. $\mathrm{false}^L_{u^S}, \mathrm{true}^L_{u^S} \in \mathrm{Inner}(u^S)$

3. if $\phi_1, \phi_2 \in \mathrm{Inner}(u^S)$ then also

   $\phi_1 \wedge^L_{u^S} \phi_2 \in \mathrm{Inner}(u^S)$
   $\phi_1 \vee^L_{u^S} \phi_2 \in \mathrm{Inner}(u^S)$
   $\neg^L_{u^S} \phi_1 \in \mathrm{Inner}(u^S)$

4. if $\phi \in \mathrm{Inner}(u^S)$ and $u$ is a term variable that does not
   occur in $u^S$, then also

   $\exists^L_{u^S} u.\,\phi \in \mathrm{Inner}(u^S)$

   $\forall^L_{u^S} u.\,\phi \in \mathrm{Inner}(u^S)$

If $\phi \in \mathrm{Inner}(u^S)$ and $u_1,\ldots,u_n$ is the set of free term
variables of $\phi$, we write $\phi(u^S,u_1,\ldots,u_n)$ for $\phi$. Furthermore, if
$t^S$ is a term of shape sort and $t_1,\ldots,t_n$ terms of term sort,
we write $\phi(t^S,t_1,\ldots,t_n)$ for

$\phi[u^S := t^S, u_1 := t_1, \ldots, u_n := t_n]$

where we assume that variables bound by $\exists^L$ and $\forall^L$ are
renamed to avoid the capture of variables that are free in
$t^S,t_1,\ldots,t_n$.

We call $\phi(t^S,t_1,\ldots,t_n)$ an instance of the $u^S$-inner
formula $\phi(u^S,u_1,\ldots,u_n)$.

If $\phi(u^S,u_1,\ldots,u_n)$ is an inner formula, we abbreviate
it by writing $[\phi'(u_1,\ldots,u_n)]_{u^S}$ where $\phi'$ results from
$\phi(u^S,u_1,\ldots,u_n)$ by omitting the shape argument from
the operations occurring in $\phi(u^S,u_1,\ldots,u_n)$. Similarly, we
write $[\phi'(t_1,\ldots,t_n)]_{t^S}$ for $\phi(t^S,t_1,\ldots,t_n)$.



According to the semantics in Figure 10, sh is a homomorphism
from the term algebra of terms to the term algebra
of shapes. In addition, lssh is a homomorphism from the
term algebra of leafsets to the term algebra of shapes.
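The homomorphism property of sh can be illustrated with a minimal Python sketch. The encoding is an assumption made only for this illustration: a constructed term is a tuple `(f, t1, ..., tk)`, any non-tuple value is an element of C, the constant shape is written `"c"`, and shapified(f) is represented by the same name `f`.

```python
CS = "c"  # hypothetical name for the constant shape of primitive terms

def sh(t):
    """Shape of a term: replace every leaf (an element of C) by CS."""
    if isinstance(t, tuple):
        f, *args = t
        return (f, *[sh(a) for a in args])
    return CS

def leaves(s, prefix=()):
    """Leaf positions of a shape, each a tuple of 1-based child indices."""
    if isinstance(s, tuple):
        out = set()
        for i, child in enumerate(s[1:], start=1):
            out |= leaves(child, prefix + (i,))
        return out
    return {prefix}

def lssh(leafset):
    """Shape of a leafset (s, L): its first component."""
    s, _ = leafset
    return s

t1 = ("g", 0, ("g", 1, 2))
t2 = ("g", 5, 7)
# sh commutes with the constructor g
assert sh(("g", t1, t2)) == ("g", sh(t1), sh(t2))
```

In this encoding a leafset is a pair of a shape and a set of leaf positions, so lssh is simply the first projection.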

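We also have the following important property. Let $r \in L_C$
be a relation symbol of arity $n$, let $f \in \Sigma$ be a function
symbol of arity $k$, and let

$$\mathrm{sh}(t_{1j}) = \ldots = \mathrm{sh}(t_{nj}) = s_j$$

for $1 \le j \le k$. If $f^S = \mathrm{shapified}(f)$, $f^L = \mathrm{leafified}(f)$, and
$s = f^S(s_1,\ldots,s_k)$ then

$$r_s(f(t_{11},\ldots,t_{1k}),\ldots,f(t_{n1},\ldots,t_{nk})) = f^L(r_{s_1}(t_{11},\ldots,t_{n1}),\ldots,r_{s_k}(t_{1k},\ldots,t_{nk})) \qquad (93)$$

Furthermore, if $\mathrm{lssh}(l_j) = \mathrm{lssh}(l'_j) = s_j$ for $1 \le j \le k$ and
$s = f^S(s_1,\ldots,s_k)$ then

$$f^L(l_1,\ldots,l_k) \wedge^L_s f^L(l'_1,\ldots,l'_k) = f^L(l_1 \wedge^L_{s_1} l'_1,\ldots,l_k \wedge^L_{s_k} l'_k)$$

$$f^L(l_1,\ldots,l_k) \vee^L_s f^L(l'_1,\ldots,l'_k) = f^L(l_1 \vee^L_{s_1} l'_1,\ldots,l_k \vee^L_{s_k} l'_k)$$

$$\neg^L_s f^L(l_1,\ldots,l_k) = f^L(\neg^L_{s_1} l_1,\ldots,\neg^L_{s_k} l_k) \qquad (94)$$

$$\exists^L_s t.\ f^L(h_1(t),\ldots,h_k(t)) =^L f^L(\exists^L_{s_1} t.\,h_1(t),\ldots,\exists^L_{s_k} t.\,h_k(t))$$

$$\forall^L_s t.\ f^L(h_1(t),\ldots,h_k(t)) =^L f^L(\forall^L_{s_1} t.\,h_1(t),\ldots,\forall^L_{s_k} t.\,h_k(t))$$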
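The boolean-algebra identities in (94) can be checked on a small instance. The following is a sketch under an assumed encoding (a leafset is a pair `(shape, frozenset_of_leaf_positions)`, a shape is `"c"` or a tuple `(f, s1, ..., sk)`); the helper names `f_L`, `l_and`, and `l_not` are hypothetical and introduced only here.

```python
def leaves(s, prefix=()):
    """Leaf positions of a shape, each a tuple of 1-based child indices."""
    if isinstance(s, tuple):
        out = set()
        for i, child in enumerate(s[1:], start=1):
            out |= leaves(child, prefix + (i,))
        return out
    return {prefix}

def f_L(f, *leafsets):
    """Leafset constructor: prefix child index i to every leaf of the i-th argument."""
    shape = (f, *[s for s, _ in leafsets])
    L = frozenset((i,) + l
                  for i, (_, Li) in enumerate(leafsets, start=1)
                  for l in Li)
    return (shape, L)

def l_and(a, b):
    """Intersection of two leafsets of the same shape."""
    assert a[0] == b[0], "undefined: shapes differ"
    return (a[0], a[1] & b[1])

def l_not(a):
    """Complement relative to the leaves of the shape."""
    return (a[0], frozenset(leaves(a[0])) - a[1])

s2 = ("g", "c", "c")
l1, l1p = ("c", frozenset({()})), ("c", frozenset())
l2, l2p = (s2, frozenset({(1,)})), (s2, frozenset({(1,), (2,)}))

# conjunction and negation distribute through the constructor
assert l_and(f_L("g", l1, l2), f_L("g", l1p, l2p)) == f_L("g", l_and(l1, l1p), l_and(l2, l2p))
assert l_not(f_L("g", l1, l2)) == f_L("g", l_not(l1), l_not(l2))
```

The complement is taken relative to the leaves of the shape, which is why every boolean operation on leafsets carries a shape argument.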

From these properties by induction we conclude that if
$\phi(u^S,u_1,\ldots,u_n)$ is an inner formula, then

$$\phi(s, f(t_{11},\ldots,t_{1k}),\ldots,f(t_{n1},\ldots,t_{nk})) = f^L(\phi(s_1,t_{11},\ldots,t_{n1}),\ldots,\phi(s_k,t_{1k},\ldots,t_{nk})) \qquad (95)$$

Let $\phi(u^S,u_1,\ldots,u_n)$ be an inner formula and let
$\phi'(u_1,\ldots,u_n)$ be a first-order formula that results from
replacing operations $\wedge^L_{u^S}, \vee^L_{u^S}, \neg^L_{u^S}, \forall^L_{u^S}, \exists^L_{u^S}$ by $\wedge, \vee, \neg, \forall, \exists$.
Interpreting $\phi'(u_1,\ldots,u_n)$ over the structure $C$ yields a relation
$\rho' \subseteq C^n$. If

$$\mathrm{sh}(t_1) = \ldots = \mathrm{sh}(t_n) = s$$

then

$$\phi(s,t_1,\ldots,t_n) = (s, \{\, l \mid \rho'(t_1[l],\ldots,t_n[l]) \,\})$$


The following Definition 75 introduces a more restricted
set of formulas than the set of formulas permitted by sort
declarations in Figure 9. We call this restricted set of formulas
simple formulas. One of the main properties of simple
formulas compared to arbitrary formulas is that simple formulas
allow the use of operations $\exists^L$, $\forall^L$, and relations $r_{\_}$,
$r \in L_C$ only within instances of $u^S$-inner formulas.

Definition 74 A simple operation is any operation or relation
in Figure 9 except for operations $\exists^L$, $\forall^L$, and relations
$r_{\_}$ for $r \in L_C$.






Definition 75 (Simple Formulas) The set of simple formulas
is the least set that satisfies the following.

1. if $\phi(u^S,u_1,\ldots,u_n)$ is an inner formula, $t^S$ a term of
   shape sort, $t_1,\ldots,t_n$ terms of term sort and $u^L$ is a
   leafset variable, then

   $u^L =^L \phi(t^S,t_1,\ldots,t_n)$

   is a simple formula.

2. applying simple operations to simple formulas yields
   simple formulas.

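Example 76 A formula

$$u^L =^L \exists^L_{u_1^S} u.\ r_{u_2^S}(u,u) \qquad (96)$$

is not a simple formula for $u_1^S \not\equiv u_2^S$. Formula

$$(u_1^S =^S u_2^S \wedge u^L =^L \exists^L_{u_1^S} u.\ r_{u_1^S}(u,u)) \ \vee\ (u_1^S \neq^S u_2^S \wedge \mathrm{undef})$$

is a simple formula equivalent to formula (96). We abbreviate
$\exists^L_{u^S} u.\ r_{u^S}(u,u)$ as $[\exists^L u.\ r(u,u)]_{u^S}$.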

♦ 

Lemma 77 shows that for every formula in the logic of 
Figure 9 there exists an equivalent simple formula. Note 
that even simple formulas are sufficient to express the re- 
lations of structural subtyping. A reader not interested in 
the decidability of the more general logic of Figure 9 may 
therefore ignore Lemma 77. 

Lemma 77 (Formula Simplification) For every well- 
defined formula in the logic of Figure 9 there exists an equiv- 
alent well-defined simple formula. 

Proof Sketch. According to the definition of simple formula,
we need to ensure that every occurrence of quantifiers
$\forall^L$, $\exists^L$ and relations $r_{\_}$ is an occurrence in some inner-formula
instance $\phi(t^S,t_1,\ldots,t_n)$. Each occurrence $r_{t^S}(t_1,\ldots,t_n)$ is
an inner formula instance by itself, so the main difficulty is
fitting the quantifiers $\forall^L$ and $\exists^L$ into inner formulas.

Let us examine the syntactic structure of formulas of the
logic in Figure 9. This syntactic structure is determined
by sort declarations. Each expression of leafset sort is formed
starting from

1. relations $r \in L_C$;

2. leafset variables;

3. $\mathrm{true}^L$, $\mathrm{false}^L$

using operations $\wedge^L, \vee^L, \neg^L, \forall^L, \exists^L$, as well as $f^L$ and $f^L_i$. The
leafset expressions can be used in a formula in the following
ways (in addition to constructing new leafset expressions):

1. to compare for equality using $=^L$;

2. to test for the top-level constructor using $\mathrm{Is}_f^L$;

3. to form leafset cardinality constraints;

4. to form a shape using lssh.



Because the top-level sort of a formula is bool, every
term $t^L$ of sort leafset occurs within some formula $t^L =^L t'^L$
or $\mathrm{Is}_f^L(t^L)$, $|t^L|_{t^S} = k$, $|t^L|_{t^S} \ge k$ or as part of some term
$\mathrm{lssh}(t^L)$. We can replace $\mathrm{Is}_f^L(t^L)$ with

$$\exists^L u^L.\ u^L =^L t^L \wedge \mathrm{Is}_f^L(u^L)$$

according to Lemma 10, so we need not consider that case.
We can similarly eliminate non-variable leafset terms from
cardinality constraints. If a leafset term $t^L$ occurs in an
expression $\mathrm{lssh}(t^L)$, we consider the smallest atomic formula
$\psi(\mathrm{lssh}(t^L))$ enclosing $\mathrm{lssh}(t^L)$, and replace $\psi(\mathrm{lssh}(t^L))$ with

$$\exists^L u^L.\ u^L =^L t^L \wedge \psi(\mathrm{lssh}(u^L))$$

This transformation is valid by Lemma 10 because $\psi$ and
lssh are strict.

We further assume that in every atomic formula $t_1^L =^L t_2^L$,
the term $t_1^L$ is a leafset variable.

Suppose that a term $t^L$ in a formula $u^L =^L t^L$ is not an
instance of an inner formula. Then there are two possibilities.

1. There are some occurrences of leafset term algebra operations
   $f^L$, $f^L_i$, or leafset variables $u_1^L$ in $t^L$. Here by
   "occurrence" in $t^L$ we mean occurrence that is reachable
   without going through a shape argument or a relation,
   but only through operations $\forall^L, \exists^L, \wedge^L, \vee^L, \neg^L$. For example,
   we ignore the occurrences of $f^L$, $f^L_i$ within terms
   $t^S$ that occur in $\wedge^L_{t^S}$.

2. Not all shape arguments in $\forall^L, \exists^L, \wedge^L, \vee^L, \neg^L$,
   $\mathrm{true}^L, \mathrm{false}^L, r_{\_}$ occurring in $t^L$ are syntactically identical.

We eliminate the first possibility by propagating leafset
term algebra operations $f^L$, $f^L_i$ inwards until they reach expressions
of form $r_{t^S}(t_1,\ldots,t_n)$, applying the equations (94)
from left to right. We then convert $f^L$, $f^L_i$ operations of the
term algebra of leafsets into operations of the term algebra
of terms applying (93) from right to left.

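To eliminate the second possibility, let $t_1^S,\ldots,t_n^S$ be the
occurrences (reachable through $\mathrm{true}^L, \mathrm{false}^L, \wedge^L, \vee^L, \neg^L$,
$\forall^L, \exists^L$) in term $t^L$ of the shape arguments of operations
$\mathrm{true}^L, \mathrm{false}^L, \wedge^L, \vee^L, \neg^L, \forall^L, \exists^L$. Then replace

$$u^L =^L t^L(t_1^S,\ldots,t_n^S)$$

with

$$\bigl(\exists^S u^S.\ \forall^{t_1^S}(u^S =^S t_1^S) \wedge \ldots \wedge \forall^{t_n^S}(u^S =^S t_n^S) \wedge u^L =^L t^L(u^S,\ldots,u^S)\bigr) \ \vee\ \bigl(\mathrm{undef} \wedge \bigvee_{1 \le i < j \le n} t_i^S \neq t_j^S\bigr)$$

Here $\forall^{t_i^S}$ denotes universal quantification $\forall u_{i,1},\ldots,u_{i,n_i}$
where $u_{i,1},\ldots,u_{i,n_i}$ is a list of those term variables occurring
in $t_i^S$ that are bound by some quantifier $\exists^L$, $\forall^L$ within $t^L$.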



6.4 Quantifier Elimination 

In this section we give a quantifier elimination procedure for 
the term-power structure. The procedure of this section is 
applicable whenever C is a structure with a decidable first- 
order theory. 
Definition 78 below generalizes the notion of structural
base formula of Definition 41, Section 4.3. There are two
main differences between Definition 41 and the present
Definition 78.

The first difference is the presence of three (instead of 
two) base formulas: shape base, leafset base, and term base. 
This difference is a consequence of the distinction between 
leafsets and terms and is needed whenever base structure 
C has more than two elements. There is a homomorphism 
formula relating leafset base formula to shape base formula 
and a homomorphism formula relating term base formula to 
shape base formula. Furthermore, some of the leafset vari- 
ables are determined by term variables using inner formula 
maps, which establishes the relationship between term base 
formula and leafset base formula. Cardinality constraints 
now apply to leafset variables. 

The second difference is the distinction between composed
and primitive non-parameter leafset and term variables.
A composed non-parameter variable denotes a leafset
or a term whose shape $s$ has property $\mathrm{Is}_f^S(s)$ for some $f \in \Sigma$.
A primitive non-parameter variable denotes a leafset or a
term whose shape is $c^S$ and has property $\mathrm{Is}_{\mathrm{PRIM}}$ or $\mathrm{Is}_{\mathrm{PRIM}}^L$. The
purpose of this distinction is to allow cardinality constraints
and inner formula maps not only on parameter variables,
but also on primitive non-parameter variables, which is useful
when the base structure $C$ is decidable but infinite.

Definition 78 (Structural Base Formula)
A structural base formula with:

• free term variables $x_1,\ldots,x_m$;

• internal composed non-parameter term variables $u_1,\ldots,u_r$;

• internal primitive non-parameter term variables $u_{r+1},\ldots,u_p$;

• internal parameter term variables $u_{p+1},\ldots,u_{p+q}$;

• free leafset variables $x_1^L,\ldots,x^L_{m^L}$;

• internal composed non-parameter leafset variables $u_1^L,\ldots,u^L_{r^L}$;

• internal primitive non-parameter leafset variables $u^L_{r^L+1},\ldots,u^L_{p^L}$;

• internal parameter leafset variables $u^L_{p^L+1},\ldots,u^L_{p^L+q^L}$;

• free shape variables $x_1^S,\ldots,x^S_{m^S}$;

• internal non-parameter shape variables $u_1^S,\ldots,u^S_{p^S}$;

• internal parameter shape variables $u^S_{p^S+1},\ldots,u^S_{p^S+q^S}$

is a formula of form:

    $\exists u_1,\ldots,u_n,\ u_1^L,\ldots,u^L_{n^L},\ u_1^S,\ldots,u^S_{n^S}.$
      $\mathrm{shapeBase}(u_1^S,\ldots,u^S_{n^S},x_1^S,\ldots,x^S_{m^S}) \ \wedge$
      $\mathrm{leafsetBase}(u_1^L,\ldots,u^L_{n^L},x_1^L,\ldots,x^L_{m^L}) \ \wedge$
      $\mathrm{leafsetHom}(u_1^L,\ldots,u^L_{n^L},u_1^S,\ldots,u^S_{n^S}) \ \wedge$
      $\mathrm{termBase}(u_1,\ldots,u_n,x_1,\ldots,x_m) \ \wedge$
      $\mathrm{termHom}(u_1,\ldots,u_n,u_1^S,\ldots,u^S_{n^S}) \ \wedge$
      $\mathrm{cardin}(u^L_{r^L+1},\ldots,u^L_{n^L},u^S_{p^S+1},\ldots,u^S_{n^S}) \ \wedge$
      $\mathrm{innerMap}(u_{r+1},\ldots,u_n,u^L_{r^L+1},\ldots,u^L_{n^L},u^S_{p^S+1},\ldots,u^S_{n^S})$

where $n = p + q$, $n^L = p^L + q^L$, $n^S = p^S + q^S$, and formulas
shapeBase, leafsetBase, termBase, leafsetHom, termHom,
cardin, innerMap are defined as follows.

$$\mathrm{shapeBase}(u_1^S,\ldots,u^S_{n^S},x_1^S,\ldots,x^S_{m^S}) = \bigwedge_{i=1}^{p^S} u_i^S =^S t_i(u_1^S,\ldots,u^S_{n^S}) \ \wedge\ \bigwedge_{i=1}^{m^S} x_i^S =^S u^S_{j_i} \ \wedge\ \mathrm{distinct}(u_1^S,\ldots,u^S_{n^S})$$

where each $t_i$ is a shape term of form $f^S(u^S_{i_1},\ldots,u^S_{i_k})$ for
some $f \in \Sigma_0$, $k = \mathrm{ar}(f)$, and $j : \{1,\ldots,m^S\} \to \{1,\ldots,n^S\}$ is
a function mapping indices of free shape variables to indices
of internal shape variables.

$$\mathrm{leafsetBase}(u_1^L,\ldots,u^L_{n^L},x_1^L,\ldots,x^L_{m^L}) = \bigwedge_{i=1}^{r^L} u_i^L =^L t_i(u_1^L,\ldots,u^L_{n^L}) \ \wedge\ \bigwedge_{i=r^L+1}^{p^L} \mathrm{Is}_{\mathrm{PRIM}}^L(u_i^L) \ \wedge\ \bigwedge_{i=1}^{m^L} x_i^L =^L u^L_{j_i}$$

where each $t_i$ is a term of form $f^L(u^L_{i_1},\ldots,u^L_{i_k})$ for some
$f \in \Sigma$, $k = \mathrm{ar}(f)$, and $j : \{1,\ldots,m^L\} \to \{1,\ldots,n^L\}$ is a
function mapping indices of free leafset variables to indices
of internal leafset variables.

$$\mathrm{leafsetHom}(u_1^L,\ldots,u^L_{n^L},u_1^S,\ldots,u^S_{n^S}) = \bigwedge_{i=1}^{n^L} \mathrm{lssh}(u_i^L) =^S u^S_{j_i}$$

where $j : \{1,\ldots,n^L\} \to \{1,\ldots,n^S\}$ is some function such
that $\{j_1,\ldots,j_{p^L}\} \subseteq \{1,\ldots,p^S\}$ and $\{j_{p^L+1},\ldots,j_{p^L+q^L}\} \subseteq
\{p^S+1,\ldots,p^S+q^S\}$ (a leafset variable is a parameter variable
iff its shape is a parameter shape variable).

$$\mathrm{termBase}(u_1,\ldots,u_n,x_1,\ldots,x_m) = \bigwedge_{i=1}^{r} u_i = t_i(u_1,\ldots,u_n) \ \wedge\ \bigwedge_{i=r+1}^{p} \mathrm{Is}_{\mathrm{PRIM}}(u_i) \ \wedge\ \bigwedge_{i=1}^{m} x_i = u_{j_i}$$

where each $t_i$ is a term of form $f(u_{i_1},\ldots,u_{i_k})$ for some
$f \in \Sigma$, $k = \mathrm{ar}(f)$, and $j : \{1,\ldots,m\} \to \{1,\ldots,n\}$ is a
function mapping indices of free term variables to indices of
internal term variables.

$$\mathrm{termHom}(u_1,\ldots,u_n,u_1^S,\ldots,u^S_{n^S}) = \bigwedge_{i=1}^{n} \mathrm{sh}(u_i) =^S u^S_{j_i}$$

where $j : \{1,\ldots,n\} \to \{1,\ldots,n^S\}$ is some function such
that $\{j_1,\ldots,j_p\} \subseteq \{1,\ldots,p^S\}$ and $\{j_{p+1},\ldots,j_{p+q}\} \subseteq \{p^S+1,\ldots,p^S+q^S\}$
(a term variable is a parameter variable iff its
shape is a parameter shape variable).

$$\mathrm{cardin}(u^L_{r^L+1},\ldots,u^L_{n^L},u^S_{p^S+1},\ldots,u^S_{n^S}) = \psi_1 \wedge \cdots \wedge \psi_d$$

where each $\psi_i$ is of form

$$|t^L(u^L_{r^L+1},\ldots,u^L_{n^L})|_{u^S} = k$$

or

$$|t^L(u^L_{r^L+1},\ldots,u^L_{n^L})|_{u^S} \ge k$$

for some $u^S$-term $t^L(u^L_{r^L+1},\ldots,u^L_{n^L})$ that contains no variables
other than some of the variables $u^L_{r^L+1},\ldots,u^L_{n^L}$, and
the following condition holds:

    If a variable $u^L_j$ for $r^L + 1 \le j \le n^L$ occurs in
    the term $t^L(u^L_{r^L+1},\ldots,u^L_{n^L})$, then $\mathrm{lssh}(u^L_j) = u^S$
    occurs in formula leafsetHom.                                (97)



$$\mathrm{innerMap}(u_{r+1},\ldots,u_n,u^L_{r^L+1},\ldots,u^L_{n^L},u^S_{p^S+1},\ldots,u^S_{n^S}) = \eta_1 \wedge \cdots \wedge \eta_e$$

where each $\eta_i$ is of form

$$u^L_j =^L \phi(u^S,u_{i_1},\ldots,u_{i_k})$$

for some inner formula $\phi(u^S,u_{i_1},\ldots,u_{i_k}) \in \mathrm{Inner}(u^S)$
where $r^L + 1 \le j \le n^L$, i.e. $u^L_j$ is a primitive
non-parameter leafset variable or parameter leafset variable,
$\{u_{i_1},\ldots,u_{i_k}\} \subseteq \{u_{r+1},\ldots,u_n\}$ are primitive non-parameter
term variables and parameter variables, the conjunct
$\mathrm{lssh}(u^L_j) = u^S$ occurs in leafsetHom, and the following
condition holds:

    $\mathrm{sh}(u_{i_j}) = u^S$ occurs in formula termHom for
    every $j$ where $1 \le j \le k$.                               (98)



We require each structural base formula to satisfy the following
conditions:

P0) the graph associated with shape base formula

    $\exists u_1^S,\ldots,u^S_{n^S}.\ \mathrm{shapeBase}(u_1^S,\ldots,u^S_{n^S},x_1^S,\ldots,x^S_{m^S})$

    is acyclic (compare to Definition 21);

P1) congruence closure property for shapeBase subformula:
    there are no two distinct variables $u_i^S$ and $u_j^S$ such that
    both $u_i^S = f(u^S_{l_1},\ldots,u^S_{l_k})$ and $u_j^S = f(u^S_{l_1},\ldots,u^S_{l_k})$ occur
    as conjuncts in formula shapeBase;

P2) congruence closure property for leafsetBase subformula:
    there are no two distinct variables $u_i^L$ and $u_j^L$ such that
    both $u_i^L = f^L(u^L_{l_1},\ldots,u^L_{l_k})$ and $u_j^L = f^L(u^L_{l_1},\ldots,u^L_{l_k})$
    occur as conjuncts in formula leafsetBase;

P3) congruence closure property for termBase subformula:
    there are no two distinct variables $u_i$ and $u_j$ such that
    both $u_i = f(u_{l_1},\ldots,u_{l_k})$ and $u_j = f(u_{l_1},\ldots,u_{l_k})$ occur
    as conjuncts in formula termBase;

P4) homomorphism property of lssh: for every non-parameter
    leafset variable $u^L$ such that $u^L = f^L(u^L_{i_1},\ldots,u^L_{i_k})$
    occurs in leafsetBase, if conjunct
    $\mathrm{lssh}(u^L) = u^S$ occurs in leafsetHom, then for some shape
    variables $u^S_{j_1},\ldots,u^S_{j_k}$ term $u^S = f^S(u^S_{j_1},\ldots,u^S_{j_k})$ occurs
    in shapeBase where $f^S = \mathrm{shapified}(f)$ and for every $r$
    where $1 \le r \le k$, conjunct $\mathrm{lssh}(u^L_{i_r}) = u^S_{j_r}$ occurs in
    leafsetHom.

P5) homomorphism property of sh: for every non-parameter
    term variable $u$ such that $u = f(u_{i_1},\ldots,u_{i_k})$ occurs
    in termBase, if conjunct $\mathrm{sh}(u) = u^S$ occurs in
    termHom, then for some shape variables $u^S_{j_1},\ldots,u^S_{j_k}$
    term $u^S = f^S(u^S_{j_1},\ldots,u^S_{j_k})$ occurs in shapeBase where
    $f^S = \mathrm{shapified}(f)$ and for every $r$ where $1 \le r \le k$,
    conjunct $\mathrm{sh}(u_{i_r}) = u^S_{j_r}$ occurs in termHom.
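The acyclicity condition P0 and the congruence closure conditions P1 to P3 are simple syntactic checks on the equations of a base formula. The sketch below assumes a hypothetical encoding in which the conjuncts `u = f(u_1, ..., u_k)` of one base formula are stored as a dictionary from the left-hand variable to the tuple `(f, u_1, ..., u_k)`; the function names are illustrative and not taken from the paper.

```python
def congruence_closed(eqs):
    """P1-P3: no two distinct variables are defined by the same right-hand side."""
    seen = {}
    for var, rhs in eqs.items():
        if rhs in seen and seen[rhs] != var:
            return False
        seen[rhs] = var
    return True

def acyclic(eqs):
    """P0: the graph with an edge from each variable to its arguments has no cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(v):
        c = color.get(v, WHITE)
        if c == GREY:        # back edge: v is reachable from itself
            return False
        if c == BLACK:
            return True
        color[v] = GREY
        for arg in eqs.get(v, (None,))[1:]:   # parameters have no defining equation
            if not visit(arg):
                return False
        color[v] = BLACK
        return True

    return all(visit(v) for v in eqs)

ok = {"s_vz": ("g", "s_w", "s_w"), "s_w": ("g", "s_w1", "s_w2")}
bad = {"a": ("g", "p", "q"), "b": ("g", "p", "q")}
cyc = {"a": ("g", "a", "p")}
print(congruence_closed(ok), acyclic(ok))
print(congruence_closed(bad), acyclic(cyc))
```

A transformation into base form would merge the two variables of `bad` rather than reject the formula; the check only detects that the invariant does not yet hold.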



As in Section 3.4 and Section 4.3 we proceed to show that
each quantifier-free formula can be written as a disjunction
of base formulas and each base formula can be written as
a quantifier-free formula. We first give a small example to
illustrate how the techniques of Section 4.3 extend to the
more general case of $\Sigma$-term-power.

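Example 79 We solve one subproblem from Example 42
using the language of term-power algebras.
Consider the formula

$$\exists v.\ g(v,z) \le g(z,v) \wedge \mathrm{Is}_g(v) \wedge \mathrm{Is}_g(w) \wedge \neg(g_1(w) \le g_1(v)) \qquad (99)$$

Formula (99) is in the language of Figure 8, with $\le$ a binary
lifted relation. After converting (99) into the language of
Figure 9 we obtain as one of the possible cases formula:

    $\exists v.\ [g(v,z) \preceq g(z,v)]_{\mathrm{sh}(g(z,v))} =^L \mathrm{true}_{\mathrm{sh}(g(z,v))} \ \wedge$
      $\mathrm{sh}(g(v,z)) =^S \mathrm{sh}(g(z,v)) \ \wedge$
      $\mathrm{Is}_g(v) \wedge \mathrm{Is}_g(w) \ \wedge$                                   (100)
      $[g_1(w) \preceq g_1(v)]_{\mathrm{sh}(g_1(w))} \neq^L \mathrm{true}_{\mathrm{sh}(g_1(w))} \ \wedge$
      $\mathrm{sh}(g_1(w)) =^S \mathrm{sh}(g_1(v))$

where $\preceq$ is the subtyping relation on the base structure $C$, so
that $\le$ is the lifted relation corresponding to $\preceq$. We next transform the formula into unnested
form, obtaining:

    $\exists v, u_{vz}, u_{zv}, u_{w1}, u_{v1}.\ \exists^L u^L_{vz}, u^L_{w1}.\ \exists^S u^S_{vz}, u^S_{w1}.$
      $u_{vz} = g(v,z) \wedge u_{zv} = g(z,v) \ \wedge$
      $u_{w1} = g_1(w) \wedge u_{v1} = g_1(v) \ \wedge$
      $u^S_{vz} =^S \mathrm{sh}(u_{vz}) \wedge u^S_{w1} =^S \mathrm{sh}(u_{w1}) \ \wedge$
      $\mathrm{sh}(u_{zv}) =^S u^S_{vz} \wedge \mathrm{sh}(u_{v1}) =^S u^S_{w1} \ \wedge$                 (101)
      $\mathrm{Is}_g(v) \wedge \mathrm{Is}_g(w) \ \wedge$
      $u^L_{vz} =^L [u_{vz} \preceq u_{zv}]_{u^S_{vz}} \wedge |\neg^L u^L_{vz}|_{u^S_{vz}} = 0 \ \wedge$
      $u^L_{w1} =^L [u_{w1} \preceq u_{v1}]_{u^S_{w1}} \wedge |\neg^L u^L_{w1}|_{u^S_{w1}} \ge 1$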

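We next transform (101) into a disjunction of base formulas.
A typical base formula is:

    $\exists u_{vz}, u_{zv}, u_v, u_z, u_w, u_{v1}, u_{v2}, u_{z1}, u_{z2}, u_{w1}, u_{w2}.$
    $\exists^L u^L_{vz}, u^L_v, u^L_z, u^L_{v1}, u^L_{v2}, u^L_{z1}, u^L_{z2}, u^L_{w1}.$
    $\exists^S u^S_{vz}, u^S_w, u^S_{w1}, u^S_{w2}.$
      $\mathrm{shapeBase}_1 \ \wedge$                                                   (102)
      $\mathrm{leafsetBase}_1 \wedge \mathrm{leafsetHom}_1 \ \wedge$
      $\mathrm{termBase}_1 \wedge \mathrm{termHom}_1 \ \wedge$
      $\mathrm{cardin}_1 \wedge \mathrm{innerMap}_1$

    $\mathrm{shapeBase}_1 = u^S_{vz} = g^S(u^S_w,u^S_w) \wedge u^S_w = g^S(u^S_{w1},u^S_{w2}) \ \wedge$
        $\mathrm{distinct}(u^S_{vz},u^S_w,u^S_{w1},u^S_{w2})$

    $\mathrm{leafsetBase}_1 = u^L_{vz} = g^L(u^L_v,u^L_z) \ \wedge$
        $u^L_v = g^L(u^L_{v1},u^L_{v2}) \wedge u^L_z = g^L(u^L_{z1},u^L_{z2})$

    $\mathrm{leafsetHom}_1 = \mathrm{lssh}(u^L_{vz}) = u^S_{vz} \ \wedge$
        $\mathrm{lssh}(u^L_v) = u^S_w \wedge \mathrm{lssh}(u^L_z) = u^S_w \ \wedge$
        $\mathrm{lssh}(u^L_{v1}) = u^S_{w1} \wedge \mathrm{lssh}(u^L_{v2}) = u^S_{w2} \ \wedge$
        $\mathrm{lssh}(u^L_{z1}) = u^S_{w1} \wedge \mathrm{lssh}(u^L_{z2}) = u^S_{w2} \ \wedge$
        $\mathrm{lssh}(u^L_{w1}) = u^S_{w1}$

    $\mathrm{termBase}_1 = u_{vz} = g(u_v,u_z) \wedge u_{zv} = g(u_z,u_v) \ \wedge$
        $u_v = g(u_{v1},u_{v2}) \wedge u_z = g(u_{z1},u_{z2}) \ \wedge$
        $u_w = g(u_{w1},u_{w2}) \ \wedge$
        $z = u_z \wedge w = u_w$

    $\mathrm{termHom}_1 =$
        $\mathrm{sh}(u_{vz}) = u^S_{vz} \wedge \mathrm{sh}(u_{zv}) = u^S_{vz} \ \wedge$
        $\mathrm{sh}(u_v) = u^S_w \wedge \mathrm{sh}(u_z) = u^S_w \wedge \mathrm{sh}(u_w) = u^S_w \ \wedge$
        $\mathrm{sh}(u_{v1}) = u^S_{w1} \wedge \mathrm{sh}(u_{z1}) = u^S_{w1} \wedge \mathrm{sh}(u_{w1}) = u^S_{w1} \ \wedge$
        $\mathrm{sh}(u_{v2}) = u^S_{w2} \wedge \mathrm{sh}(u_{z2}) = u^S_{w2} \wedge \mathrm{sh}(u_{w2}) = u^S_{w2}$

    $\mathrm{innerMap}_1 =$
        $u^L_{v1} =^L [u_{v1} \preceq u_{z1}]_{u^S_{w1}} \wedge u^L_{z1} =^L [u_{z1} \preceq u_{v1}]_{u^S_{w1}} \ \wedge$
        $u^L_{v2} =^L [u_{v2} \preceq u_{z2}]_{u^S_{w2}} \wedge u^L_{z2} =^L [u_{z2} \preceq u_{v2}]_{u^S_{w2}} \ \wedge$
        $u^L_{w1} =^L [u_{w1} \preceq u_{v1}]_{u^S_{w1}}$

    $\mathrm{cardin}_1 = |\neg^L u^L_{v1}|_{u^S_{w1}} = 0 \wedge |\neg^L u^L_{z1}|_{u^S_{w1}} = 0 \ \wedge$
        $|\neg^L u^L_{v2}|_{u^S_{w2}} = 0 \wedge |\neg^L u^L_{z2}|_{u^S_{w2}} = 0 \ \wedge$
        $|\neg^L u^L_{w1}|_{u^S_{w1}} \ge 1$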

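We next show how to transform the base formula (102) into
quantifier-free form.

We substitute away non-parameter term variables
$u_{vz}, u_{zv}, u_v$ and non-parameter leafset variables $u^L_{vz}, u^L_v, u^L_z$,
because the homomorphism constraints they participate in
may be derived from the remaining conjuncts. We next eliminate
parameter term variables $u_{v1}, u_{v2}$ and parameter leafset
variables $u^L_{v1}, u^L_{v2}, u^L_{z1}, u^L_{z2}, u^L_{w1}$. Grouping the conjuncts
in $\mathrm{cardin}_1$ and $\mathrm{innerMap}_1$ by their shape, we may extract the
subformulas $\psi_1$ and $\psi_2$ of (102).

    $\psi_1 =$
      $\exists u_{v1}.\ \exists^L u^L_{v1}, u^L_{z1}, u^L_{w1}.$
        $\mathrm{sh}(u_{v1}) =^S u^S_{w1} \wedge \mathrm{sh}(u_{z1}) =^S u^S_{w1} \wedge \mathrm{sh}(u_{w1}) =^S u^S_{w1} \ \wedge$
        $\mathrm{lssh}(u^L_{v1}) =^S u^S_{w1} \wedge \mathrm{lssh}(u^L_{z1}) =^S u^S_{w1} \ \wedge$
        $\mathrm{lssh}(u^L_{w1}) =^S u^S_{w1} \ \wedge$
        $u^L_{v1} =^L [u_{v1} \preceq u_{z1}]_{u^S_{w1}} \wedge u^L_{z1} =^L [u_{z1} \preceq u_{v1}]_{u^S_{w1}} \ \wedge$
        $u^L_{w1} =^L [u_{w1} \preceq u_{v1}]_{u^S_{w1}} \ \wedge$
        $|\neg^L u^L_{v1}|_{u^S_{w1}} = 0 \wedge |\neg^L u^L_{z1}|_{u^S_{w1}} = 0 \ \wedge$
        $|\neg^L u^L_{w1}|_{u^S_{w1}} \ge 1$

and

    $\psi_2 =$
      $\exists u_{v2}.\ \exists^L u^L_{v2}, u^L_{z2}.$
        $\mathrm{sh}(u_{v2}) =^S u^S_{w2} \wedge \mathrm{sh}(u_{z2}) =^S u^S_{w2} \ \wedge$
        $\mathrm{lssh}(u^L_{v2}) =^S u^S_{w2} \wedge \mathrm{lssh}(u^L_{z2}) =^S u^S_{w2} \ \wedge$
        $u^L_{v2} =^L [u_{v2} \preceq u_{z2}]_{u^S_{w2}} \wedge u^L_{z2} =^L [u_{z2} \preceq u_{v2}]_{u^S_{w2}} \ \wedge$
        $|\neg^L u^L_{v2}|_{u^S_{w2}} = 0 \wedge |\neg^L u^L_{z2}|_{u^S_{w2}} = 0$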

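Formula $\psi_1$ expresses a fact in a structure isomorphic to
the power $C^n$ where $n$ is the number of leaves in the shape
denoted by $u^S_{w1}$. Similarly, $\psi_2$ expresses a fact in a product
structure $C^m$ where $m$ is the number of leaves in the
shape denoted by $u^S_{w2}$. We can therefore use the
Feferman-Vaught technique (Section 3.3) to eliminate the
quantifiers from formulas $\psi_1$ and $\psi_2$. According to Example 17,
$\psi_1$ is equivalent to:

    $u^L_0 =^L [\exists^L t.\ t \preceq u_{z1} \wedge^L u_{z1} \preceq t \wedge^L u_{w1} \preceq t]_{u^S_{w1}} \ \wedge$
    $u^L_4 =^L [\exists^L t.\ t \preceq u_{z1} \wedge^L u_{z1} \preceq t \wedge^L \neg^L u_{w1} \preceq t]_{u^S_{w1}} \ \wedge$
    $|u^L_4|_{u^S_{w1}} \ge 1 \ \wedge\ |\neg^L u^L_0 \wedge^L \neg^L u^L_4|_{u^S_{w1}} = 0$

We similarly apply the Feferman-Vaught construction to $\psi_2$ and
obtain the result true. We may now substitute the results of
quantifier elimination in $\psi_1$ and $\psi_2$. The resulting formula
is:

    $\exists u_{vz}, u_{zv}, u_v, u_z, u_w, u_{v1}, u_{v2}, u_{z1}, u_{z2}, u_{w1}, u_{w2}.$
    $\exists^L u^L_{vz}, u^L_v, u^L_z, u^L_{v1}, u^L_{v2}, u^L_{z1}, u^L_{z2}, u^L_{w1}.$
    $\exists^S u^S_{vz}, u^S_w, u^S_{w1}, u^S_{w2}.$
      $\mathrm{shapeBase}_1 \ \wedge$
      $\mathrm{leafsetHom}_2 \ \wedge$
      $\mathrm{termBase}_2 \wedge \mathrm{termHom}_2 \ \wedge$
      $\mathrm{cardin}_2 \wedge \mathrm{innerMap}_2$

where

    $\mathrm{leafsetHom}_2 = \mathrm{lssh}(u^L_0) = u^S_{w1} \wedge \mathrm{lssh}(u^L_4) = u^S_{w1}$

    $\mathrm{termBase}_2 = u_z = g(u_{z1},u_{z2}) \wedge u_w = g(u_{w1},u_{w2}) \ \wedge$
        $z = u_z \wedge w = u_w$

    $\mathrm{innerMap}_2 =$
        $u^L_0 =^L [\exists^L t.\ t \preceq u_{z1} \wedge^L u_{z1} \preceq t \wedge^L u_{w1} \preceq t]_{u^S_{w1}} \ \wedge$
        $u^L_4 =^L [\exists^L t.\ t \preceq u_{z1} \wedge^L u_{z1} \preceq t \wedge^L \neg^L u_{w1} \preceq t]_{u^S_{w1}}$

    $\mathrm{cardin}_2 = |u^L_4|_{u^S_{w1}} \ge 1 \ \wedge\ |\neg^L u^L_0 \wedge^L \neg^L u^L_4|_{u^S_{w1}} = 0$

In the resulting formula all variables are expressible in terms
of free variables, so we can write the formula without quantifiers
$\exists, \forall, \exists^L, \forall^L$.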

♦ 
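The Feferman-Vaught step of the example can be sanity-checked by brute force on a small instance. The sketch below rests on several assumptions made only for illustration: C is the four-element poset of 2-bit masks ordered by bitwise inclusion, the shape has two leaves, and the eliminated condition asks for a tuple v with v and z mutually related at every leaf and w not below v at some leaf. The names `brute` and `leafwise` are hypothetical.

```python
from itertools import product

C = range(4)          # assumed base structure: 2-bit masks
N = 2                 # assumed number of leaves of the shape

def leq(a, b):
    """Assumed base order: a is below b when a's bits are contained in b's."""
    return a & b == a

def brute(z, w):
    """Quantifier over the power C^N: search for a witness tuple v directly."""
    return any(
        all(leq(v[l], z[l]) and leq(z[l], v[l]) for l in range(N))
        and any(not leq(w[l], v[l]) for l in range(N))
        for v in product(C, repeat=N)
    )

def leafwise(z, w):
    """Feferman-Vaught style: quantify over C at each leaf, then count leaves."""
    def base(t, l):
        return leq(t, z[l]) and leq(z[l], t)
    u0 = {l for l in range(N) if any(base(t, l) and leq(w[l], t) for t in C)}
    u4 = {l for l in range(N) if any(base(t, l) and not leq(w[l], t) for t in C)}
    uncovered = set(range(N)) - u0 - u4
    return len(u4) >= 1 and len(uncovered) == 0

tuples = list(product(C, repeat=N))
assert all(brute(z, w) == leafwise(z, w) for z in tuples for w in tuples)
```

The sets `u0` and `u4` play the role of the leafsets in the quantifier-free result, and the final condition mirrors its two cardinality constraints: some leaf must admit a witness violating the last relation, and every leaf must admit some witness.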

The following Proposition 80 is analogous to Proposition 44;
the proof is straightforward.

Proposition 80 (Quantification of Struct. Base) If $\beta$
is a structural base formula and $x$ a free shape, leafset, or
term variable in $\beta$, then there exists a structural base formula
$\beta_1$ equivalent to $\exists x.\,\beta$.

The following Proposition 81 corresponds to Proposition 45. 

Proposition 81 (Quantifier-Free to Structural Base)

Let $\phi$ be a well-defined simple formula without quantifiers
$\exists^L, \forall^L, \exists, \forall, \exists^S, \forall^S$. Then $\phi$ can be written as true, false, or
a disjunction of structural base formulas.

Proof Sketch. The overall idea of the transformation to
base formula is similar to the transformation in the proof of
Proposition 45. Additional complexity is due to inner formulas.
However, note that an inner formula $\phi(u^S,u_1,\ldots,u_n)$
is well-defined iff $\delta(u^S,u_1,\ldots,u_n)$ holds where

$$\delta(u^S,u_1,\ldots,u_n) = \mathrm{sh}(u_1) =^S u^S \wedge \ldots \wedge \mathrm{sh}(u_n) =^S u^S$$

Hence, each formula $\phi(u^S,u_1,\ldots,u_n)$ can be treated as a
partial operation $p$ of sort

$$\mathrm{shape} \times \mathrm{term}^n \to \mathrm{leafset}$$

and the domain given by

$$D_p = \langle (u^S,u_1,\ldots,u_n),\ \delta(u^S,u_1,\ldots,u_n) \rangle$$

This means that we may apply Proposition 9 and convert the
formula to a disjunction of existentially quantified well-defined
conjunctions of literals in one of the following forms:

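1. equality with inner formulas: $u_0^L =^L \phi(u^S,u_1,\ldots,u_n)$
   where $\phi(u^S,u_1,\ldots,u_n)$ is a $u^S$-inner formula;

2. formulas of leafset boolean algebra:

   $u_0^L =^L u_1^L \wedge^L_{u^S} u_2^L$
   $u_0^L =^L \neg^L_{u^S} u_1^L$
   $u_0^L =^L \mathrm{true}^L_{u^S}$
   $u_0^L =^L \mathrm{false}^L_{u^S}$

3. formulas of term algebra of terms:

   $u_1 = u_2$, $u_1 \neq u_2$
   $u_0 = f(u_1,\ldots,u_k)$
   $u = f_i(u_0)$
   $\mathrm{Is}_f(u_0)$, $\neg\mathrm{Is}_f(u_0)$
   $\mathrm{sh}(u) =^S u^S$

4. formulas of term algebra of leafsets:

   $u_1^L =^L u_2^L$, $u_1^L \neq^L u_2^L$
   $u_0^L =^L f^L(u_1^L,\ldots,u_k^L)$
   $u^L =^L f^L_i(u_0^L)$
   $\mathrm{Is}_f^L(u_0^L)$, $\neg\mathrm{Is}_f^L(u_0^L)$
   $\mathrm{lssh}(u^L) =^S u^S$

5. formulas of term algebra of shapes:

   $u_1^S =^S u_2^S$, $u_1^S \neq^S u_2^S$
   $u_0^S =^S f^S(u_1^S,\ldots,u_k^S)$
   $u^S =^S f^S_i(u_0^S)$
   $\mathrm{Is}_f^S(u_0^S)$, $\neg\mathrm{Is}_f^S(u_0^S)$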

We next describe transformation of each existentially 
quantified conjunction. In the sequel, whenever we perform 
case analysis and generate a disjunction of conjunctions, ex- 
istential quantifiers propagate to the conjunctions, so we 
keep working with a existentially quantified conjunction. 
The existentially quantified variables will become internal 
variables of a structural base formula. 

Analogously to the proof of Proposition 28, we use
(90), (91), (16) to eliminate literals $\neg\mathrm{Is}_f(u_0)$, $\neg\mathrm{Is}_f^L(u_0^L)$.

As in the proof of Proposition 45, we replace formulas of 
leafset boolean algebra by cardinality constraints, similarly 
to Figure 7. 

We next convert formulas of term algebra of terms into
a base formula, formulas of term algebra of leafsets into a
base formula, and formulas of term algebra of shapes into a
base formula.

We simultaneously make sure that every term or leafset
variable has an associated shape variable, introducing
new shape variables if needed.

We also ensure homomorphism requirements by replac- 
ing internal variables when we entail their equality. 

Another condition we ensure is that parameter term variables
map to parameter shape variables, and non-parameter
term variables to non-parameter shape variables; we do this
by performing expansion of term and shape variables.

We perform expansion of shape variables as in Section 3.2.
Expansion of term and leafset variables is even simpler
because there is no need to do case analysis on equality of
a term variable with other variables.

We eliminate disequality between term variables using (92).
We eliminate disequalities between leafset variables
as in Example 43, by converting each disequality into
a cardinality constraint. Elimination of disequalities might
violate previously established homomorphism invariants, so
we may need to reestablish these invariants by repeating the
previously described steps. The overall process terminates
because we never introduce new disequalities between term
or leafset variables.

As a final step, we convert all cardinality constraints into 
constraints on parameter term variables, using (95). 

In the case when the shape of a cardinality constraint is $c^S$,
we cannot apply (95). However, in this case, unlike in
Proposition 45, we do not do case analysis on all possible constant
leafsets (this is not even possible in general). This is because
Definition 78, unlike Definition 41, implies no need to further
decompose cardinality constraints in that case, because we
allow primitive non-parameter leafset variables.

This completes our sketch of transforming a quantifier- 
free formula into disjunction of structural base formulas. ■ 

We introduce the notion of determined variables in a structural
base formula, generalizing Definition 29 and Definition 46.

For brevity, we write $u^*$ for internal shape, term, or leafset
variables, similarly $x^*$ for a free variable, $t^*$ for a term,
$f^*$ for a shape, term, or leafset term algebra constructor,
and $f^*_i$ for a shape, term, or leafset term algebra selector.

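Definition 82 The set dets of variable determinations of a
structural base formula $\beta$ is the least set $S$ of pairs $\langle u^*,t^* \rangle$
where $u^*$ is an internal term, leafset, or shape variable and
$t^*$ is a term over the free variables of $\beta$, such that:

1. if $x^* = u^*$ occurs in termBase, leafsetBase, or
   shapeBase, then $\langle u^*,x^* \rangle \in S$;

2. if $\langle u^*,t^* \rangle \in S$ and $u^* = f^*(u^*_1,\ldots,u^*_k)$ occurs
   in shapeBase, termBase, or leafsetBase then
   $\{\langle u^*_1, f^*_1(t^*) \rangle,\ldots,\langle u^*_k, f^*_k(t^*) \rangle\} \subseteq S$;

3. if $\{\langle u^*_1,t^*_1 \rangle,\ldots,\langle u^*_k,t^*_k \rangle\} \subseteq S$ and $u^* =
   f^*(u^*_1,\ldots,u^*_k)$ occurs in shapeBase, termBase, or
   leafsetBase then $\langle u^*, f^*(t^*_1,\ldots,t^*_k) \rangle \in S$;

4. if $\langle u,t \rangle \in S$ and $\mathrm{sh}(u) = u^S$ occurs in termHom then
   $\langle u^S, \mathrm{sh}(t) \rangle \in S$;

5. if $\langle u^L,t^L \rangle \in S$ and $\mathrm{lssh}(u^L) = u^S$ occurs in leafsetHom
   then $\langle u^S, \mathrm{lssh}(t^L) \rangle \in S$;

6. if $u^L = \phi(u^S,u_1,\ldots,u_n)$ occurs in innerMap
   where $\phi(u^S,u_1,\ldots,u_n)$ is an inner formula
   and $\{\langle u^S,t^S \rangle, \langle u_1,t_1 \rangle,\ldots,\langle u_n,t_n \rangle\} \subseteq S$, then
   $\langle u^L, \phi(t^S,t_1,\ldots,t_n) \rangle \in S$. (In the special case
   when $\phi$ contains no free term variables, if $\langle u^S,t^S \rangle \in S$
   then $\langle u^L, \phi(t^S) \rangle \in S$.)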

Definition 83 An internal variable $u^*$ is determined if
$\langle u^*,t^* \rangle \in \mathrm{dets}$ for some term $t^*$. An internal variable is
undetermined if it is not determined.
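Which variables are determined can be computed as a fixed point. The following Python sketch is a simplification made for illustration: it handles a single sort only, covers rules 1 to 3 of Definition 82, and records one witness term per variable rather than the full (possibly infinite) set of determinations. The dictionary encoding and the function name `determined` are assumptions, not notation from the paper.

```python
def determined(eqs, free_bindings):
    """
    eqs: internal variable -> (f, arg_1, ..., arg_k) for conjuncts u = f(u_1..u_k).
    free_bindings: free variable -> internal variable for conjuncts x = u.
    Returns a dict mapping each determined internal variable to one witness term.
    """
    witness = {u: x for x, u in free_bindings.items()}        # rule 1
    changed = True
    while changed:
        changed = False
        for u, (f, *args) in eqs.items():
            if u in witness:                                   # rule 2: selectors
                for i, a in enumerate(args, start=1):
                    if a not in witness:
                        witness[a] = f"{f}_{i}({witness[u]})"
                        changed = True
            elif all(a in witness for a in args):              # rule 3: constructor
                witness[u] = f"{f}({', '.join(witness[a] for a in args)})"
                changed = True
    return witness

eqs = {
    "u_z": ("g", "u_z1", "u_z2"),
    "u_w": ("g", "u_w1", "u_w2"),
    "u_s": ("g", "u_p", "u_z1"),
}
dets = determined(eqs, {"z": "u_z", "w": "u_w"})
print(dets)
```

In this run `u_p` and `u_s` receive no witness, so they are undetermined; `u_s` is also a source in the sense used by the lemmas that follow, since it does not appear as an argument of any equation.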

Lemma 84 Let $\beta$ be a structural base formula with matrix
$\beta_0$ and let dets be the determinations of $\beta$. If $\langle u^*,t^* \rangle \in \mathrm{dets}$
then $\models \beta_0 \Rightarrow u^* = t^*$.

Proof. By induction, using Definition 82. ■

Corollary 85 Let $\beta$ be a structural base formula such that
every internal variable is determined. Then $\beta$ is equivalent
to a well-defined formula without quantifiers $\exists^L, \forall^L, \exists, \forall$,
$\exists^S, \forall^S$.

Proof. By Lemma 84 using (7). ■

Lemma 86 Let $u$ be an undetermined composed non-parameter
term variable in a structural base formula $\beta$ such
that $u$ is a source, i.e. no conjunct of form

$$u' = f(u_1,\ldots,u,\ldots,u_k)$$

occurs in termBase. Let $\beta'$ be the result of dropping $u$ from
$\beta$. Then $\beta$ is equivalent to $\beta'$.

Proof. Because $u$ is a composed non-parameter term
variable, it does not occur in innerMap, so it only occurs
in termBase and termHom. The conjunct containing $u$ in
termHom is a consequence of the remaining conjuncts, so it
may be dropped. After that, applying (7) yields a structural
base formula $\beta'$ not containing $u$, where $\beta'$ is equivalent to
$\beta$. ■



Lemma 87 Let $u^L$ be an undetermined composed non-parameter
leafset variable in a structural base formula $\beta$ such
that $u^L$ is a source, i.e. no conjunct of form

$$u'^L = f^L(u_1^L,\ldots,u^L,\ldots,u_k^L)$$

occurs in leafsetBase. Let $\beta'$ be the result of dropping $u^L$
from $\beta$. Then $\beta$ is equivalent to $\beta'$.

Proof. Because $u^L$ is a composed non-parameter leafset
variable, it does not occur in innerMap or cardin, so it only
occurs in leafsetBase and leafsetHom. The conjunct containing
$u^L$ in leafsetHom is a consequence of the remaining
conjuncts, so it may be dropped. After that, applying (7)
yields a structural base formula $\beta'$ not containing $u^L$, where
$\beta'$ is equivalent to $\beta$. ■

Corollary 88 Every base formula is equivalent to a base 
formula without undetermined composed non-parameter 
term variables and without undetermined composed non- 
parameter leafset variables. 

Proof. If a structural base formula has an undetermined
composed non-parameter term variable, then it has an undetermined
composed non-parameter term variable that is
a source; similarly for leafset variables. By repeated application
of Lemma 86 and Lemma 87 we eliminate all undetermined
composed non-parameter term and leafset variables. ■

The following Proposition 89 corresponds to Proposi- 
tion 53 and Proposition 66. 

Proposition 89 (Struct. Base to Quantifier-Free)

Every structural base formula $\beta$ is equivalent to a well-defined
simple formula $\phi$ without quantifiers $\exists^L, \forall^L, \exists, \forall$,
$\exists^S, \forall^S$.

Proof Sketch. By Corollary 88 we may assume that
$\beta$ has no undetermined composed non-parameter term and
leafset variables. By Corollary 85 we are done if there are
no undetermined variables, so it suffices to eliminate:

1. undetermined parameter term variables,

2. undetermined primitive non-parameter term variables,

3. undetermined parameter leafset variables,

4. undetermined primitive non-parameter leafset variables, and

5. undetermined shape variables.

If $u$ is an undetermined parameter term variable or a primitive
non-parameter term variable, then $u$ does not occur in
termBase, so it occurs only in termHom and innerMap. If
$u^L$ is an undetermined parameter leafset variable or a primitive
non-parameter leafset variable then $u^L$ does not occur
in leafsetBase, so it occurs only in leafsetHom, innerMap, and
cardin.

For an undetermined term or leafset variable of shape $u^S$
such that there is an uncovered parameter or primitive non-parameter
term or leafset variable with shape $u^S$, consider
all conjuncts $\gamma_i$ in innerMap of form

$$u^L_j =^L \phi_j(u^S,u_{i_1},\ldots,u_{i_k})$$

and all conjuncts $\delta_i$ from cardin of form:

$$|t^L(u^L_{r^L+1},\ldots,u^L_{n^L})|_{u^S} = k$$

or

$$|t^L(u^L_{r^L+1},\ldots,u^L_{n^L})|_{u^S} \ge k$$

Together with formulas from termHom and leafsetHom that
contain term and leafset variables free in formulas $\gamma_i$ and $\delta_i$,
these conjuncts form a formula $\eta$ which expresses a relation
in the substructure of term-power algebra which (because
constructors are covariant) is isomorphic to a term-power
of $C$. We therefore use the Feferman-Vaught theorem from
Section 3.3 to eliminate all term and parameter variables from
$\eta$. By repeating this process we eliminate all undetermined
parameter and leafset variables.

It remains to eliminate undetermined shape variables. 
This process is similar to term algebra quantifier elimina- 
tion in Section 3.4. An essential part of construction in 
Section 3.4 is Lemma 25, which relies on the fact that unde- 
termined parameter variables may take on infinitely many 
values. We therefore ensure that undetermined parameter 
shape variables are not constrained by term and parame- 
ter variables through conjuncts outside shapeBase. An un- 
determined parameter shape variable docs not occur in 
termHom or leafsetHom because there are no parameter term 
and leafset variables, so w' can occur only in innerMap and 
cardin. 

However, because undetermined parameter and leafset
variables are eliminated from the formula, if $u^s$ is a parameter
shape variable then exactly one of these two cases holds:

1. there are some conjuncts in innerMap and cardin that
contain $u^s$ and contain some determined term and leafset
variables; in this case $u^s$ is determined, or

2. there are no conjuncts in innerMap containing $u^s$ and
cardin contains only domain cardinality constraints of
form $|1|_{u^s} = k$ and $|1|_{u^s} \geq k$.

Hence, if $u^s$ is an undetermined shape variable it remains to
eliminate the constraints of form $|1|_{u^s} = k$ and
$|1|_{u^s} \geq k$. We eliminate these constraints as in the proof
of Proposition 66.

In the resulting formula all variables are determined. By
Corollary 85 the formula can be written as a formula without
quantifiers 3SvS 3,V, 3^V^ ■ 

The following is the main result of this paper. 

Theorem 90 (Term Power Quant. Elimination)

There exist algorithms A, B such that for a given formula
$\phi$ in the language of Figure 9:

a) A produces a quantifier-free formula $\phi'$ in selector
language

b) B produces a disjunction $\phi'$ of structural base formulas

We also explicitly state the following corollary.

Corollary 91 Let C be a structure with decidable first-order
theory. Then the set of true sentences in the logic of Figure 9
interpreted in the structure $\mathcal{P}$ according to Figures 10
and 11 is decidable.



6.5 Handling Contravariant Constructors 

In this section we discuss the decidability of the
$\Sigma$-term-power structure for a decidable theory C when some
of the function symbols $f \in \Sigma$ are contravariant. We then
suggest a generalization of the notion of variance to multiple
relations and to relations with arity greater than two.

The modifications needed to accommodate contravariance
with respect to some distinguished relation symbol
$\leq \in R$ for the case of infinite C are analogous to the
modifications in Section 5.5. We thus obtain a quantifier
elimination procedure for any decidable theory C in the presence
of contravariant constructors.

Theorem 92 (Decidability of Structural Subtyping)

Let C be a decidable structure and $\mathcal{P}$ a
$\Sigma$-term-power of C. Then the first-order theory of
$\mathcal{P}$ is decidable.

In the rest of this section we consider a generalization
that allows defining variance for every relation symbol
$r \in R$ of any arity, and not just the relation symbol
$\leq \in R$.

For a given relation symbol $r \in R$, function symbol
$f \in \Sigma$ with $k = \mathrm{ar}(f)$, and integer $i$ where
$1 \leq i \leq k$, let $P_r(f, i)$ denote a permutation of the set
$\{1, \ldots, k\}$ that specifies the variance of the $i$-th
argument of $f$ with respect to the relation $r$. For example, if
$r$ is a binary relation then $P_r(f, i)$ is the identity
permutation $\{(1,1), (2,2)\}$ if the $i$-th argument of $f$ is
covariant, or the transposition $\{(1,2), (2,1)\}$ if the $i$-th
argument of $f$ is contravariant.

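If $l \in \mathrm{leaves}(s)$ is a leaf
$l = \langle f^1, i^1 \rangle \ldots \langle f^n, i^n \rangle$, define
the permutation variance($l$) as the composition of permutations:

$$\mathrm{variance}(l) = P_r(f^n, i^n) \circ \cdots \circ P_r(f^1, i^1)$$

Then define $\llbracket r \rrbracket$ by

$$\llbracket r \rrbracket(s, t_1, \ldots, t_k) = \langle s, \{ l \mid \llbracket r \rrbracket^{C}(t_{p_1}[l], \ldots, t_{p_k}[l]) \wedge (p_1, \ldots, p_k) = \mathrm{variance}(l) \} \rangle$$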

We generalize (76) by defining

$$N^{\pi}(s) = |\{ l \in \mathrm{leaves}(s) \mid \mathrm{variance}(l) = \pi \}|$$

As in Section 5.5, we can transform the constraints
$|1|_{u^s} = k$ and $|1|_{u^s} \geq k$ on each parameter shape
variable into a conjunction of constraints of form:

$$N^{\pi}(u^s) = k$$

or

$$N^{\pi}(u^s) \geq k$$
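
To make the quantities $N^{\pi}(s)$ concrete, the following Python sketch computes variance($l$) for every leaf of a finite shape and counts the leaves per permutation. The nested-tuple encoding of shapes, the constructor names `arrow` and `pair`, and the table `P` are assumptions introduced only for this illustration; permutations are written 0-based, for a binary relation.

```python
from collections import Counter

# Hypothetical variance table P_r(f, i) for a binary relation r.
# (0, 1) is the identity (covariant), (1, 0) the transposition (contravariant).
P = {
    ("arrow", 0): (1, 0),
    ("arrow", 1): (0, 1),
    ("pair", 0): (0, 1),
    ("pair", 1): (0, 1),
}

def compose(p, q):
    """Return the composition p o q, i.e. the map i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def leaf_variances(shape, acc=(0, 1)):
    """Yield variance(l) for every leaf l of a shape.

    A shape is either the string "c" (constant shape, one leaf) or a
    tuple (f, child_1, ..., child_k)."""
    if shape == "c":
        yield acc
        return
    f, *children = shape
    for i, child in enumerate(children):
        # Later path steps are composed on the left, as in the definition.
        yield from leaf_variances(child, compose(P[(f, i)], acc))

def N(shape):
    """Map each permutation pi to N^pi(shape), the number of leaves with that variance."""
    return Counter(leaf_variances(shape))
```

For the shape `("arrow", ("arrow", "c", "c"), "c")`, two leaves are covariant and one is contravariant, so `N` returns the counts 2 and 1 for `(0, 1)` and `(1, 0)` respectively.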

A problem on nonnegative integers. To solve the 
problem of variance with any number of relation symbols of 
any arity, it suffices to solve the following problem on sets 
of tuples of non-negative integers. 

Let Nat = {0, 1, 2, ...}. Consider the structure
$St = \mathrm{Nat}^d$ for some $d \geq 2$ and let
$D = \{1, 2, \ldots, d\}$. If $p$ is a permutation on $D$, let $M_p$
denote an operation $St \to St$ defined by

$$M_p(x_1, \ldots, x_d) = (x_{p_1}, \ldots, x_{p_d})$$

If $(x_1, \ldots, x_d), (y_1, \ldots, y_d) \in St$ define

$$(x_1, \ldots, x_d) + (y_1, \ldots, y_d) = (x_1 + y_1, \ldots, x_d + y_d)$$

Consider a finite set of operations $f : St^k \to St$ where each
operation $f$ is determined by $k$ permutations
$p^f_1, \ldots, p^f_k$ in the following way:

$$f(t_1, \ldots, t_k) = M_{p^f_1}(t_1) + \cdots + M_{p^f_k}(t_k)$$

Hence, each operation $f$ of arity $k$ is given by $k$
permutations, each of which specifies how to exchange the order
of components in the corresponding argument tuple. After
permuting the components, the tuples are summed up.

Given a finite set $F$ of operations $f$, let $S$ be the set
generated by operations in $F$ starting from the element
$(1, 0, \ldots, 0) \in St$. Let $C(n_1, \ldots, n_d)$ be a
conjunction of simple linear constraints of the forms

$$n_i = a_i$$

and

$$n_i \geq a_i$$

Consider the set

$$A_C = \{ (n_1, \ldots, n_d) \in S \mid C(n_1, \ldots, n_d) \}$$

The problem is: for a given set of operations $F$, is there an
algorithm that, given $C(n_1, \ldots, n_d)$, finitely computes the
set $A_C$?

End of a problem on nonnegative integers. 

We conjecture that the technique of Lemma 68 can be 
generalized to yield a solution to the problem on nonnegative 
integers and thus establish the decidability for the notion of 
variance with respect to any number of relations with any 
number of arguments. 
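
The problem on nonnegative integers can be explored experimentally. The Python sketch below enumerates the part of the generated set $S$ whose components sum to at most a given bound; it is only a bounded enumeration and does not solve the problem stated above. The function names and the 0-based encoding of permutations are assumptions of this example.

```python
from itertools import product

def apply_perm(p, x):
    """M_p(x_1, ..., x_d) = (x_{p_1}, ..., x_{p_d}), with p written 0-based."""
    return tuple(x[i] for i in p)

def apply_op(perms, args):
    """f(t_1, ..., t_k) = M_{p_1}(t_1) + ... + M_{p_k}(t_k), componentwise."""
    moved = [apply_perm(p, t) for p, t in zip(perms, args)]
    return tuple(sum(column) for column in zip(*moved))

def generate(ops, d, bound):
    """Elements of S whose components sum to at most `bound`.

    Each operation is a list of k permutations. Component sums never
    decrease under the operations, so every element of S within the
    bound is reached from arguments that are also within the bound."""
    seen = {(1,) + (0,) * (d - 1)}
    changed = True
    while changed:
        changed = False
        for perms in ops:
            for args in product(tuple(seen), repeat=len(perms)):
                v = apply_op(perms, args)
                if sum(v) <= bound and v not in seen:
                    seen.add(v)
                    changed = True
    return seen
```

For one binary operation that swaps the components of its first argument (a hypothetical contravariant-first constructor), `generate([[(1, 0), (0, 1)]], 2, 3)` yields four tuples, and filtering them by a constraint such as $n_1 = 1 \wedge n_2 \geq 1$ gives the corresponding bounded part of $A_C$.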

6.6 A Note on Element Selection 

We make a brief note related to the choice of the language 
for making statements in term-power algebras. In Section 5 
we avoided the use of leafset variables by substituting them 
into cardinality constraints. In this section we use a cylindric 
algebra of leafsets. 

An apparently even more flexible alternative is to allow
the element selection operation

select :: term × leaf → elem

where elem is a new sort, interpreted over the set C, and
leaf is a sort interpreted over the set of pairs of a shape and
a leaf. Instead of the formula

ru^{ti, ...,*„)="■ true^s 

we would then write 

V/. rus(select(ti, Z), . . . , select(t„, I)) ='" true^s 

Using the select operation we can define the update relation:

$$\mathrm{update}(t_1, l_0, e, t_2) = \forall l.\ ((l = l_0 \wedge \mathrm{select}(t_2, l) = e) \vee (l \neq l_0 \wedge \mathrm{select}(t_2, l) = \mathrm{select}(t_1, l)))$$
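
The update relation can be checked directly on terms of a fixed finite shape. In the Python sketch below a term is represented as a dictionary from leaf paths to elements of C; this representation and the function names are assumptions made for the example.

```python
def select(term, leaf):
    """Element stored at the given leaf of the term."""
    return term[leaf]

def is_update(t1, l0, e, t2):
    """Check update(t1, l0, e, t2) by ranging over all leaves of the shared shape."""
    if t1.keys() != t2.keys():
        # Terms of different shapes are never related by update.
        return False
    return all(
        (l == l0 and select(t2, l) == e)
        or (l != l0 and select(t2, l) == select(t1, l))
        for l in t1
    )
```

With `t1 = {"l": 1, "r": 2}` and `t2 = {"l": 1, "r": 5}`, the relation holds for leaf `"r"` and element `5`, and fails for any other leaf or element.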



The resulting language is at least as expressive as the
language in Figure 5. This language is interesting because it
allows reasoning about updates to leaves of a tree of fixed
shape, thus generalizing the theory of updatable arrays [33]
to the theory of trees with update operations, which would
be useful for program verification. We did not choose this
more expressive language in this report for the following
reason.

If the base structure C has a finite domain C, then for
a certain reasonable choice of the relations interpreting $L_C$
it is possible to express statements of this extended language
in the logic of Figure 9. The idea is to assume a partial order
on the elements of C with a minimal element, and use terms
t with exactly one non-minimal leaf to model the leaves.

On the other hand, in the more interesting case when C
is infinite, we can easily obtain undecidable theories in the
presence of the selection operation. Namely, the selection
operation allows terms to be used as finite sets of elements
of C. The term-power therefore increases the expressiveness from
the first-order theory to the weak monadic second-order theory,
which allows quantification over finite sets of objects.
Weak monadic theory allows in particular inductive definitions.
Even if the theory of the structure C is decidable, the weak
monadic theory might therefore still be undecidable. As an
example we might take the term algebra itself, whose weak
monadic theory would allow defining the subterm relation,
yielding an undecidable theory [55, Page 508].

7 Some Connections with MSOL 

This section explores some relationships between the the- 
ory of structural subtyping and monadic second-order logic 
(MSOL) interpreted over tree-like structures. We present 
it as a series of remarks that are potentially useful for un- 
derstanding the first-order theory of structural subtyping of 
recursive types, see [36, 37] for similar results in the context 
of the theory of feature trees. 

In Section 7.1 we exhibit an embedding of MSOL of the
infinite binary tree into the first-order theory of structural
subtyping of recursive types with two constant symbols a, b
and one covariant binary function symbol f. MSOL of the
infinite binary tree is decidable. Although the embedding does
not give an answer to the decidability of the structural
subtyping of recursive types, it does show that the problem is at
least as difficult as decidability of MSOL over infinite trees.
We therefore expect that, if the theory of structural subtyping
of recursive types is decidable, the decidability proof will
likely either use decidability of MSOL over infinite trees, or
directly use techniques similar to those of [18, 56].

In Section 7.2 we use the embedding in Section 7.1 to
argue the decidability of formulas of the first-order theory
of structural subtyping of recursive types where variables
range over terms of a certain fixed infinite shape $s_\epsilon$.

In Section 7.3 we present an encoding of all terms using
terms of shape $s_\epsilon$. We argue that the main obstacle in
using this encoding to show the decidability of the first-order
theory of structural subtyping of recursive types is the
inability to define the set of all prefix-closed terms of the
shape $s_\epsilon$.

In Section 7.4 we generalize the decidability result of Sec- 
tion 7.2 by allowing different variables to range over different 
constant shapes. 

In Section 7.5 we illustrate some of the difficulties in
reducing the first-order theory of structural subtyping to MSOL
over tree-like structures. We show that if we use a certain
form of infinite feature trees instead of infinite terms, the
decidability follows.

In Section 7.6 we point out that monadic second-order
logic with prefix-closed sets is undecidable, which follows
from [48]. This fact indicates that if we hope to show the
decidability of structural subtyping of recursive types, it is
essential to maintain the incomparability of types of different
shape.

7.1 Structural Subtyping Recursive Types 

In this section we define the problem of structural
subtyping of recursive types. We then give an embedding of
MSOL of the infinite binary tree into the first-order theory
of structural subtyping of infinite terms over the signature
$\Sigma = \{a, b, g\}$ with the partial order $\leq$.

We define MSOL over the infinite binary tree [6, Page 317]
as the structure
$\mathrm{MSOL}^{(1)} = \langle \{0,1\}^*, \mathrm{succ}_0, \mathrm{succ}_1 \rangle$.
The domain of the structure is the set $\{0,1\}^*$ of all finite
strings over the alphabet $\{0,1\}$. We denote first-order
variables by lowercase letters such as $x, y, z$. First-order
variables range over finite words $w \in \{0,1\}^*$. We denote
second-order variables by uppercase letters such as $X, Y, Z$.
Second-order variables range over finite and infinite subsets
$S \subseteq \{0,1\}^*$. The only relational symbol is equality,
with the standard interpretation. There are two function symbols,
denoting the appending of the symbol 0 and the appending of the
symbol 1 to a word:

$$\mathrm{succ}_0\, w = w \cdot 0$$

$$\mathrm{succ}_1\, w = w \cdot 1$$

For the purpose of embedding into the first-order theory
of structural subtyping, we consider a structure
$\mathrm{MSOL}^{(2)} = \langle \{0,1\}^*, \subseteq, \mathrm{Succ}_0, \mathrm{Succ}_1 \rangle$
equivalent to $\mathrm{MSOL}^{(1)}$. We use the language of MSOL
without first-order variables to make statements within
$\mathrm{MSOL}^{(2)}$. $\subseteq$ is a binary relation on sets
denoting the subset relation:

$$Y_1 \subseteq Y_2 \iff \forall x.\ x \in Y_1 \Rightarrow x \in Y_2$$

$\mathrm{Succ}_0$ and $\mathrm{Succ}_1$ are binary relations on sets,
$\mathrm{Succ}_0, \mathrm{Succ}_1 \subseteq 2^{\{0,1\}^*} \times 2^{\{0,1\}^*}$,
defined as follows:

$$\mathrm{Succ}_0(Y_1, Y_2) \iff Y_2 = \{ w \cdot 0 \mid w \in Y_1 \}$$

$$\mathrm{Succ}_1(Y_1, Y_2) \iff Y_2 = \{ w \cdot 1 \mid w \in Y_1 \}$$

The structure $\mathrm{MSOL}^{(2)}$ is similar to one in [18]; the
difference is that relations $\mathrm{Succ}_0$ and $\mathrm{Succ}_1$
are true even for non-singleton sets.

Lemmas 93 and 94 show the expected equivalence of
$\mathrm{MSOL}^{(2)}$ and $\mathrm{MSOL}^{(1)}$.

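Lemma 93 ($\mathrm{MSOL}^{(1)}$ expresses $\mathrm{MSOL}^{(2)}$)
Every relation on sets definable in $\mathrm{MSOL}^{(2)}$ is
definable in $\mathrm{MSOL}^{(1)}$.

Proof. We express relations $\subseteq$, $\mathrm{Succ}_0$,
$\mathrm{Succ}_1$ as formulas in $\mathrm{MSOL}^{(1)}$, as follows.
We express $Y_1 \subseteq Y_2$ as

$$\forall x.\ Y_1(x) \Rightarrow Y_2(x),$$

$\mathrm{Succ}_0(Y_1, Y_2)$ as

$$\forall x.\ Y_2(x) \iff \exists y.\ Y_1(y) \wedge x = \mathrm{succ}_0(y),$$

and $\mathrm{Succ}_1(Y_1, Y_2)$ as

$$\forall x.\ Y_2(x) \iff \exists y.\ Y_1(y) \wedge x = \mathrm{succ}_1(y).$$

The statement follows by induction on the structure of
formulas. ∎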

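Let $R \subseteq (2^{\{0,1\}^*})^k \times (\{0,1\}^*)^n$ be a
relation of arity $k + n$. Define
$R^* \subseteq (2^{\{0,1\}^*})^k \times (2^{\{0,1\}^*})^n$ by

$$R^*(Y_1, \ldots, Y_k, X_1, \ldots, X_n) = \exists x_1, \ldots, x_n.\ X_1 = \{x_1\} \wedge \cdots \wedge X_n = \{x_n\} \wedge R(Y_1, \ldots, Y_k, x_1, \ldots, x_n)$$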

Lemma 94 ($\mathrm{MSOL}^{(2)}$ expresses $\mathrm{MSOL}^{(1)}$)
If $R$ is definable in $\mathrm{MSOL}^{(1)}$, then $R^*$ is
definable in $\mathrm{MSOL}^{(2)}$.

Proof Sketch. The property of being an empty set is definable
in $\mathrm{MSOL}^{(2)}$ by the formula

$$\phi_0(Y_1) = \forall Y_2.\ Y_1 \subseteq Y_2$$

The relation $\subset$ of being a proper subset is definable in
$\mathrm{MSOL}^{(2)}$ by the formula

$$\phi_1(Y_1, Y_2) = Y_1 \subseteq Y_2 \wedge Y_1 \neq Y_2$$

and the relation $\subset_1$ of having one element more is
definable by the formula

$$\phi_2(Y_1, Y_2) = Y_1 \subset Y_2 \wedge \neg \exists Z.\ Y_1 \subset Z \wedge Z \subset Y_2$$

The property of being a singleton set can then be expressed
by the formula

$$\phi_3(Y_1) = \exists Y_0.\ \phi_0(Y_0) \wedge Y_0 \subset_1 Y_1$$

We define the relation on singletons corresponding to
$\mathrm{succ}_0$ by

$$\phi_4(Y_1, Y_2) = \phi_3(Y_1) \wedge \phi_3(Y_2) \wedge \mathrm{Succ}_0(Y_1, Y_2)$$

Similarly, the relation corresponding to $\mathrm{succ}_1$ is
defined by

$$\phi_5(Y_1, Y_2) = \phi_3(Y_1) \wedge \phi_3(Y_2) \wedge \mathrm{Succ}_1(Y_1, Y_2)$$

If $R$ is expressible by some formula $\psi$ in
$\mathrm{MSOL}^{(1)}$, then $R$ is expressible by a formula in
prenex normal form, so suppose $\psi$ is of form

$$Q_1 V_1 \ldots Q_n V_n.\ \psi_0$$

where $\psi_0$ is quantifier free. We construct a formula $\psi'$
expressing $R^*$ in $\mathrm{MSOL}^{(2)}$. We obtain the matrix
$\psi'_0$ of $\psi'$ by translating $\psi_0$ as follows. If $x$ is
a first-order variable in $\psi_0$, we represent it with a
second-order variable $X$ denoting a singleton set. We replace
the membership relation $Y(x)$ with the subset relation
$X \subseteq Y$. We replace $\mathrm{succ}_0$ with $\phi_4$, and
$\mathrm{succ}_1$ with $\phi_5$. We construct $\psi'$ by adding
quantifiers to $\psi'_0$ as follows. Second-order quantifiers
remain the same. First-order quantifiers are relativized to range
over singleton sets: $\forall x.\ \psi_i$ becomes
$\forall X.\ \phi_3(X) \Rightarrow \psi'_i$ and
$\exists x.\ \psi_i$ becomes
$\exists X.\ \phi_3(X) \wedge \psi'_i$. ∎
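
The definitions of emptiness, proper subset, "one element more" and singleton used in this proof sketch can be sanity-checked by brute force. The Python sketch below restricts all set quantifiers to the subsets of a small finite universe, which is an assumption of the example; in the structure itself the quantifiers range over all subsets of $\{0,1\}^*$.

```python
from itertools import combinations

# Hypothetical small universe of words; quantifiers range over its subsets only.
U = ["", "0", "1", "00"]
SETS = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def phi0(y1):
    """Y1 is empty: it is a subset of every set."""
    return all(y1 <= y2 for y2 in SETS)

def phi1(y1, y2):
    """Y1 is a proper subset of Y2."""
    return y1 <= y2 and y1 != y2

def phi2(y1, y2):
    """Y2 has exactly one element more than Y1."""
    return phi1(y1, y2) and not any(phi1(y1, z) and phi1(z, y2) for z in SETS)

def phi3(y1):
    """Y1 is a singleton: it has one element more than the empty set."""
    return any(phi0(y0) and phi2(y0, y1) for y0 in SETS)

def succ_rel(bit, y1, y2):
    """Succ_bit(Y1, Y2): Y2 = { w . bit | w in Y1 }."""
    return y2 == frozenset(w + bit for w in y1)
```

On this universe `phi3` holds exactly for the four one-element sets, which matches the intended reading of the formula.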

We can view $\mathrm{MSOL}^{(2)}$ as a first-order structure with
the domain $2^{\{0,1\}^*}$. We show how to embed
$\mathrm{MSOL}^{(2)}$ into the first-order theory of structural
subtyping.

We define the first-order structure of structural subtyping
of recursive types similarly to the corresponding structure
for non-recursive types in Section 4; the only difference
is that the domain contains both finite and infinite terms.
Infinite terms correspond to infinite trees [12, 30].

We define infinite trees as follows. We use the alphabet
$\{l, r\}$ to denote paths in the tree. A tree domain $D$ is a
finite or infinite subset of the set $\{l, r\}^*$ such that:

1. $D$ is prefix-closed: if $w \in \{l, r\}^*$, $x \in \{l, r\}$
then $w \cdot x \in D$ implies $w \in D$;

2. if $w \in D$ then exactly one of the following two
properties holds:

(a) $w$ is an interior node: $\{w \cdot l, w \cdot r\} \subseteq D$

(b) $w$ is a leaf: $\{w \cdot l, w \cdot r\} \cap D = \emptyset$.

A tree with a tree domain D is a total function T from the 
set of leaves of D to the set {a, b}. 

Note that the tree domain D of a tree T can be recon- 
structed from T as the prefix closure of the domain of the 
graph of function T; we write TDom(T) for the tree domain 
of tree T. 

Two trees are equal if they are equal as functions. Hence, 
equal trees have equal function domains and equal tree do- 
mains. 

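We say that $T_1 \leq T_2$ iff $\mathrm{TDom}(T_1) = \mathrm{TDom}(T_2)$
and $T_1(w) \leq_0 T_2(w)$ for every leaf $w \in \mathrm{TDom}(T_1)$.
Here $\leq_0$ is the relation
$\{\langle a, a \rangle, \langle a, b \rangle, \langle b, b \rangle\}$.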

If $T_1$ and $T_2$ are trees, then $g(T_1, T_2)$ denotes the tree
$T$ such that

$$\mathrm{TDom}(T) = \{ l \cdot w \mid w \in T_1 \} \cup \{ r \cdot w \mid w \in T_2 \}$$

$$T(l \cdot w) = T_1(w), \text{ if } w \in T_1$$

$$T(r \cdot w) = T_2(w), \text{ if } w \in T_2$$

Let IT denote the set of all infinite trees. The structural
subtyping structure is the structure
$\mathrm{SIT} = \langle \mathrm{IT}, g, a, b, \leq \rangle$.
SIT is an infinite-term counterpart to the structure BS from
Section 4.
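
For finite trees the definitions above can be written out directly. In the following Python sketch a tree is a dictionary from leaf paths (strings over `l` and `r`) to labels `a` or `b`; the dictionary representation and the function names are assumptions of this example, and infinite trees are not covered.

```python
LEQ0 = {("a", "a"), ("a", "b"), ("b", "b")}

def tdom(T):
    """Tree domain of T: the prefix closure of its set of leaves."""
    return {w[:i] for w in T for i in range(len(w) + 1)}

def leq(T1, T2):
    """T1 <= T2: equal tree domains and leafwise comparison by <=_0.

    For finite trees, equal leaf sets are equivalent to equal tree domains."""
    return T1.keys() == T2.keys() and all((T1[w], T2[w]) in LEQ0 for w in T1)

def g(T1, T2):
    """The tree g(T1, T2): T1 becomes the l-subtree and T2 the r-subtree."""
    T = {"l" + w: x for w, x in T1.items()}
    T.update({"r" + w: x for w, x in T2.items()})
    return T

# The constants a and b are trees with a single leaf at the root.
A = {"": "a"}
B = {"": "b"}
```

For instance, `g(A, A)` is below `g(A, B)`, while `A` and `g(A, A)` are incomparable because their tree domains differ.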

Similarly to the case of finite terms, define the relation
$\sim$ of "being of the same shape" in SIT by

$$t_1 \sim t_2 = \exists t_0.\ t_0 \leq t_1 \wedge t_0 \leq t_2$$

Observe that $t_1 \sim t_2$ iff $\mathrm{TDom}(t_1) = \mathrm{TDom}(t_2)$.

We next present an embedding $\iota$ of $\mathrm{MSOL}^{(2)}$ into
SIT. The image of the embedding $\iota$ consists of the infinite
trees that are in the same $\sim$-equivalence class as the tree
$t_\epsilon$. We define $t_\epsilon$ as the unique solution of the
equation:

$$t_\epsilon = g(g(t_\epsilon, t_\epsilon), a)$$

Trees in the $\sim$-equivalence class of $t_\epsilon$ have the tree
domain $D = \mathrm{TDom}(t_\epsilon)$ given by the regular
context-free grammar

$$D \to \epsilon \mid r \mid l \mid lrD \mid llD$$

whereas the leaves $L$ of $D$ are given by the context-free
grammar

$$L \to r \mid lrL \mid llL$$

or the regular expression $(lr \mid ll)^* r$. Let $h$ be the
homomorphism of words from $\{0,1\}^*$ to $\{l, r\}^*$ such that

$$h(0) = ll$$

$$h(1) = lr$$

If $w = a_1 \ldots a_n$ is a word, then $w^R$ denotes the reverse
of the word, $w^R = a_n \ldots a_1$.

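We define the embedding $\iota$ to map a set
$Y \subseteq \{0,1\}^*$ into the unique tree $t$ such that
$t \sim t_\epsilon$ and for every $w \in \{0,1\}^*$,

$$w \in Y \iff t(h(w^R) \cdot r) = b \qquad (103)$$

Observe that $\iota(\emptyset) = t_\epsilon$. Define formulas
$\mathrm{TSucc}_0(t_1, t_2)$ and $\mathrm{TSucc}_1(t_1, t_2)$ as
follows:

$$\mathrm{TSucc}_0(t_1, t_2) = \ t_2 = g(g(t_1, t_\epsilon), a)$$

$$\mathrm{TSucc}_1(t_1, t_2) = \ t_2 = g(g(t_\epsilon, t_1), a)$$

It is straightforward to show that $\iota$ is an injection and
that $\iota$ maps relation $\subseteq$ into $\leq$, relation
$\mathrm{Succ}_0$ into $\mathrm{TSucc}_0$, and relation
$\mathrm{Succ}_1$ into $\mathrm{TSucc}_1$. Moreover, the range of
$\iota$ is the set of all terms $t$ such that
$\mathrm{sh}(t) = s_\epsilon$ where $s_\epsilon = \mathrm{sh}(t_\epsilon)$.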
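
The correspondence between the successor relations on sets and the tree constructions can be checked on finite windows of the infinite trees. The Python sketch below represents the image of a set $Y$ only through the labels of the leaves $h(w^R) \cdot r$ for words $w$ up to a given length; this truncation, the function names, and the choice of the leaf $a$ as the last argument of the outer $g$ (so that the result again has the shape of $t_\epsilon$) are assumptions of the example.

```python
from itertools import product

def words(n):
    """All words over {0, 1} of length at most n."""
    return ["".join(p) for k in range(n + 1) for p in product("01", repeat=k)]

def h(word):
    """The homomorphism h with h(0) = ll and h(1) = lr."""
    return "".join("ll" if c == "0" else "lr" for c in word)

def iota(Y, n):
    """Leaf labels of the image of Y, restricted to leaves h(w^R).r with |w| <= n."""
    return {h(w[::-1]) + "r": ("b" if w in Y else "a") for w in words(n)}

def g(T1, T2):
    """Constructor g on leaf labellings: T1 goes under l, T2 under r."""
    T = {"l" + w: x for w, x in T1.items()}
    T.update({"r" + w: x for w, x in T2.items()})
    return T

LEAF_A = {"": "a"}

def tsucc0(t1, n):
    """Window of g(g(t1, t_eps), a), where t_eps is the image of the empty set."""
    return g(g(t1, iota(set(), n)), LEAF_A)

def tsucc1(t1, n):
    """Window of g(g(t_eps, t1), a)."""
    return g(g(iota(set(), n), t1), LEAF_A)
```

For a set `Y` of words of length at most `n`, `tsucc0(iota(Y, n), n)` coincides with `iota({w + "0" for w in Y}, n + 1)`, and similarly for `tsucc1` with the symbol `1`.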

7.2 A Decidable Substructure 

Section 7.1 shows that terms of shape $s_\epsilon$ form a
substructure within SIT that is isomorphic to
$\mathrm{MSOL}^{(2)}$. In this section we consider the following
converse problem.

Consider the formulas $\mathcal{BF}$ that, instead of quantifiers
$\exists, \forall$, contain bounded quantifiers
$\exists_\epsilon, \forall_\epsilon$ that range over the elements
of the set

$$T_\epsilon = \{ t \mid \mathrm{sh}(t) = s_\epsilon \}$$

We show that the set of closed formulas from $\mathcal{BF}$ that
are true in SIT is decidable.

Although the quantifiers are bounded, terms in this logic
can still denote elements of shape other than $s_\epsilon$. For
example, in the atomic formula

$$g(x_1, x_2) \leq g(x_3, g(g(x_4, x_5), b))$$

the term $g(x_1, x_2)$ denotes a term of the shape
$g^s(s_\epsilon, s_\epsilon)$. First we show that all atomic
formulas can be reduced to formulas of one of the following forms:

1. $x_0 = g(g(x_1, x_2), a)$;

2. $x_0 = g(g(x_1, x_2), b)$;

3. $x_1 = x_2$;

4. $x_1 \leq x_2$.

Consider an atomic formula $t_1 = t_2$. The key idea is that if
$\mathrm{sh}(t_1) \neq \mathrm{sh}(t_2)$ then the formula
$t_1 = t_2$ is false.

If none of the terms $t_1$ and $t_2$ is a variable then each of
them is a constant or a constructor application. If
$t_1 = g(t_{11}, t_{12})$ then either $t_1 = t_2$ is false or
$t_2 = g(t_{21}, t_{22})$ for some $t_{21}, t_{22}$. We may
therefore decompose $t_1 = t_2$ into $t_{11} = t_{21}$ and
$t_{12} = t_{22}$. By repeating this decomposition we arrive at
formulas of form $t_1 = t_2$ where both $t_1$ and $t_2$ are
constants or at equalities of form $x_0 = t(x_1, \ldots, x_n)$.
The equalities between the constants can be trivially evaluated.
This leaves only formulas of form $x_0 = t(x_1, \ldots, x_n)$. Let
$t^s(x^s_1, \ldots, x^s_n)$ be a shape term that results from
replacing $a$ and $b$ with $c^s$ and replacing $g$ with $g^s$ in
$t$. Because all variables range over $T_\epsilon$, we conclude
that $x_0 = t(x_1, \ldots, x_n)$ can be true only if

$$s_\epsilon = t^s(s_\epsilon, \ldots, s_\epsilon)$$

If $t(x_1, \ldots, x_n) \in \{a, b\}$ then this condition is false.
If $t(x_1, \ldots, x_n) \equiv x_1$, we obtain a formula of the
desired form. So assume $t(x_1, \ldots, x_n) = g(t_{21}, t_{22})$.
Then $\mathrm{sh}(t_{21}) = g^s(s_\epsilon, s_\epsilon)$ and
$\mathrm{sh}(t_{22}) = c^s$. Therefore,
$t_{21} = g(t_{211}, t_{212})$ where either
$\mathrm{sh}(t_{211}) = \mathrm{sh}(t_{212}) = s_\epsilon$ or
$t_1 = t_2$ is false. Similarly, either $t_{22} \in \{a, b\}$ or
$t_1 = t_2$ is false. Therefore,
$t(x_1, \ldots, x_n) = g(g(t_{211}, t_{212}), a)$,
$t(x_1, \ldots, x_n) = g(g(t_{211}, t_{212}), b)$, or $t_1 = t_2$
is false. If $t(x_1, \ldots, x_n) = g(g(t_{211}, t_{212}), a)$ then
we may replace $t_1 = t_2$ with the formula

$$\exists_\epsilon y_1, y_2.\ x_0 = g(g(y_1, y_2), a) \wedge y_1 = t_{211} \wedge y_2 = t_{212}$$

and similarly in the other case. By continuing this process
by induction on the structure of the term
$t(x_1, \ldots, x_n)$ we either conclude that $t_1 = t_2$ is false,
or we conclude that $t_1 = t_2$ is equivalent to a conjunction of
formulas of the desired form.

Conversion of atomic formulas of form $t_1 \leq t_2$ is analogous
to the conversion of formulas $t_1 = t_2$.

To see the decidability it now suffices to convert
the formulas of the form $x_0 = g(g(x_1, x_2), a)$ and
$x_0 = g(g(x_1, x_2), b)$ into formulas
$\mathrm{TSucc}_0(t_1, t_2)$ and $\mathrm{TSucc}_1(t_1, t_2)$.
Expressibility of $x_0 = g(g(x_1, x_2), a)$ follows from the fact
that the following relationship between $X_0, X_1, X_2$ is
expressible in MSOL:

$$X_0 = \{ w \cdot 0 \mid w \in X_1 \} \cup \{ w \cdot 1 \mid w \in X_2 \}$$

Similarly, the expressibility of $x_0 = g(g(x_1, x_2), b)$ follows
from the fact that

$$X_0 = \{ w \cdot 0 \mid w \in X_1 \} \cup \{ w \cdot 1 \mid w \in X_2 \} \cup \{ \epsilon \}$$

is expressible in MSOL. We conclude that the set of closed
$\mathcal{BF}$ formulas that are true in SIT is decidable.

7.3 Embedding Terms into Terms 

We next give an embedding of the set of all terms into
$T_\epsilon$. As in Section 7.1, let $t_\epsilon$ be the unique
solution of the equation $t_\epsilon = g(g(t_\epsilon, t_\epsilon), a)$
and let

$$t_4(x_1, x_2, x_3, x_4) = g(g(g(g(x_1, x_2), x_3), t_\epsilon), x_4)$$

Define

$$t_a = t_4(t_\epsilon, t_\epsilon, a, a)$$

$$t_b = t_4(t_\epsilon, t_\epsilon, a, b)$$

$$t_g(x_1, x_2) = t_4(x_1, x_2, b, b)$$

Then define the homomorphism $h_T$ from the set of all terms
to the set $T_\epsilon$ by

$$h_T(a) = t_a$$

$$h_T(b) = t_b$$

$$h_T(g(t_1, t_2)) = t_g(h_T(t_1), h_T(t_2))$$

Then $h_T$ is an embedding of the set of all terms into the
subset $T_\epsilon$ of all terms. The term algebra operations
$a, b, g$ map to $t_a, t_b, t_g$ and $\leq$ maps to $\leq$.
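
The encoding can be illustrated on finite terms by treating the infinite tree $t_\epsilon$ as an opaque symbol. In the Python sketch below, terms are nested tuples, the string `"t_eps"` stands for $t_\epsilon$, and the helper `in_T_eps` checks membership in the set of terms of the shape of $t_\epsilon$ syntactically; all of these representation choices are assumptions of the example.

```python
TE = "t_eps"  # symbolic stand-in for the infinite tree t_epsilon

def g(x, y):
    return ("g", x, y)

def t4(x1, x2, x3, x4):
    return g(g(g(g(x1, x2), x3), TE), x4)

TA = t4(TE, TE, "a", "a")
TB = t4(TE, TE, "a", "b")

def tg(x1, x2):
    return t4(x1, x2, "b", "b")

def h_T(t):
    """Encode a finite term over a, b, g as a term built from t4 and t_epsilon."""
    if t == "a":
        return TA
    if t == "b":
        return TB
    _, t1, t2 = t
    return tg(h_T(t1), h_T(t2))

def is_g(t):
    return isinstance(t, tuple) and len(t) == 3 and t[0] == "g"

def in_T_eps(t):
    """Syntactic check that t has the shape of t_epsilon, i.e. t is t_epsilon
    itself or g(g(x, y), c) with x, y of that shape and c a constant."""
    if t == TE:
        return True
    if not is_g(t):
        return False
    _, inner, leaf = t
    if leaf not in ("a", "b") or not is_g(inner):
        return False
    return in_T_eps(inner[1]) and in_T_eps(inner[2])
```

Every encoded term passes the shape check, while an ordinary finite term such as `g("a", "b")` does not.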

Note that, if it were possible to define a predicate $P(x)$
such that

$$P(x) \iff \exists y.\ h_T(y) = x \qquad (104)$$

then we could express all statements of SIT within the
$\mathcal{BF}$ subtheory, and therefore SIT would be decidable.

The fundamental problem with specifying $P(x)$ is not the
use of two bits to encode the three possible elements
$\{a, b, g\}$, but the constraint that if a term contains a
subterm of the form $t_4(t_1, t_2, a, a)$ or $t_4(t_1, t_2, a, b)$
at some even depth, then $t_1 = t_2 = t_\epsilon$. Compared to the
relationships given by the constructor $g$, this constraint
requires talking about the successor relation at the opposite
side of the paths within a tree; see Section 7.6.

7.4 Subtyping Trees of Known Shape 

We next argue that if we allow the logic to have a copy of
bounded quantifiers $\exists_s, \forall_s$ for every constant
shape $s$, we obtain a decidable theory. To denote constant shapes
in a finite number of symbols we consider, in addition to term
algebra symbols $g^s$, $c^s$, the expressions that yield solutions
of mutually recursive equations on shapes; the details of the
representation of types are not crucial for our argument, see
e.g. [12].

Consider a closed formula in such a language. Because every
variable has an associated constant shape, we can compute
the set of all shapes occurring in the formula. This
means that all variables of the formula range over a finite
known set of shapes. This allows us to define the predicate
$P$ given by (104) as a disjunction of cases, one case for every
shape. Define functions $h_{\min}$, $h_{\max}$ that take a shape
and produce a lower and upper bound for terms of that shape:

$$h_{\min}(c^s) = t_a$$

$$h_{\min}(g^s(t_1, t_2)) = t_g(h_{\min}(t_1), h_{\min}(t_2))$$

$$h_{\max}(c^s) = t_b$$

$$h_{\max}(g^s(t_1, t_2)) = t_g(h_{\max}(t_1), h_{\max}(t_2))$$

If $s_1, \ldots, s_n$ is the list of shapes occurring in a formula,
we then define a predicate $P$ specific to that formula by

$$P(t) = \bigvee_{i=1}^{n} \left( h_{\min}(s_i) \leq t \wedge t \leq h_{\max}(s_i) \right)$$

We can therefore define $P(t)$ and use it to translate the
formula into a $\mathcal{BF}$ formula of the same truth value.
Therefore, structural subtyping with quantification bounded to
constant shapes is decidable.

For decidability of the structural subtyping of recursive
types it would be interesting to examine the decision procedure
for MSOL and determine whether there is some uniformity
in it that would allow us to handle even quantification
over shapes that are determined by variables.

7.5 Recursive Feature Trees 

We next remark that a certain notion of subtyping of recursive
feature trees is decidable. By a feature tree we mean an
infinite tree built using a constructor which takes other feature
trees and an optional node label as an argument. In this
section we consider the simple case of one binary constructor $f$
and assume only one label denoted by 1. Hence, an empty
feature tree is a feature tree, and if $t_1$ and $t_2$ are feature
trees then so are $f^{\epsilon}(t_1, t_2)$ and $f^{1}(t_1, t_2)$.
We represent an empty feature tree $\epsilon$ by an infinite tree
that has all features $\epsilon$. We compare feature trees as
follows. Let $\leq$ be defined on the features $\{\epsilon, 1\}$
as the relation
$\{\langle \epsilon, \epsilon \rangle, \langle \epsilon, 1 \rangle, \langle 1, 1 \rangle\}$.
Define $\leq$ on trees as the least relation such that:

1. $\epsilon \leq t$ for all terms $t$;

2. $t_1 \leq t_1'$ and $t_2 \leq t_2'$ implies

$$f^{r_1}(t_1, t_2) \leq f^{r_2}(t_1', t_2')$$

for all $r_1, r_2 \in \{\epsilon, 1\}$ such that $r_1 \leq r_2$.

The decidability of feature trees follows from Section 7.1
because of the isomorphism $h_F$ between the set of terms
$T_\epsilon$ and the set of feature trees. Here $h_F$ is defined
by:

$$h_F(\epsilon) = t_\epsilon$$

$$h_F(f^{\epsilon}(t_1, t_2)) = g(g(h_F(t_1), h_F(t_2)), a)$$

$$h_F(f^{1}(t_1, t_2)) = g(g(h_F(t_1), h_F(t_2)), b)$$

The feature trees as we defined them have a limited fea- 
ture and node label alphabet. This is not a fundamental 
problem. Muchnik's theorem [56] gives the decidability of 
MSOL of trees over arbitrary decidable structures. It is 
reasonable to expect that the decidability of MSOL over 
decidable structures yields a generalization of the result of 
Section 7.1 and therefore the decidability of feature trees 
with a richer vocabulary of features. 

The crucial property of our definition of feature trees is 
that features can appear in any node of the tree. Hence, 
there are no prefix closure requirements on trees as in Sec- 
tion 7.3, which is responsible for relatively simple reduction 
to MSOL. 

7.6 Reversed Binary Tree with Prefix-Closed Sets 

It is instructive to compare the difficulties our approach
faces in showing the decidability of structural subtyping of
recursive types with the difficulties reported in [48]. In [48,
Section 5.3] the authors remark that the difficulty with
applying tree automata is that the set $x = f(y, z)$ is not
regular. By reversing the set of paths in a tree representing
a term we have shown in Section 7.1 that the relationship
$x = f(y, z)$ becomes expressible. However, the difficulty now
becomes specifying a set of words that represents a valid
term, because there is no immediate way of stating that a
set of words is prefix-closed. If we add an operation that
allows expressing relationships at both "ends" of the words,
we obtain a structure whose MSOL is undecidable due to
the following result [52, Page 183].

Theorem 95 The MSOL theory of the structure with two successor
operations $w \cdot 0$ and $w \cdot 1$ and one inverse successor
operation $0 \cdot w$ is undecidable.

The case that is of interest to us is the dual to Theorem 95
under the word-reversing isomorphism: a structure with
operations $0 \cdot w$, $1 \cdot w$, $w \cdot 0$ has an
undecidable set of MSOL closed formulas.

Instead of expressing prefix-closure using operations
$w \cdot 0$, $w \cdot 1$, let us consider MSOL over the structure
that contains only operations $0 \cdot w$ and $1 \cdot w$, but
where all second-order variables range over prefix-closed sets.
This logic also turns out to be undecidable.

Let PCI be the set of prefix-closed sets. For each word $w$,
there exists the smallest PCI set containing $w$, namely the
set $C(w)$ given by:

$$C(w) = \{ w' \mid w' \preceq w \}$$

Every subset of $C(w)$ in PCI is of the form $C(w_1)$ for some
word $w_1$. Define $\mathrm{PSucc}_0$ and $\mathrm{PSucc}_1$ on PCI
by:

$$\mathrm{PSucc}_0(X_1, X_2) = \exists w.\ X_1 = C(w) \wedge X_2 = C(0 \cdot w)$$

$$\mathrm{PSucc}_1(X_1, X_2) = \exists w.\ X_1 = C(w) \wedge X_2 = C(1 \cdot w)$$
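
As a small illustration of these definitions, the Python sketch below computes the prefix closure of a word, tests prefix-closedness of a finite set of words, and checks the two successor relations. The function names are assumptions of the example, and the brute-force helper considers only nonempty subsets.

```python
from itertools import combinations

def C(w):
    """Smallest prefix-closed set containing w: all prefixes of w."""
    return {w[:i] for i in range(len(w) + 1)}

def is_prefix_closed(X):
    """Every nonempty word in X has its one-symbol-shorter prefix in X."""
    return all(w[:-1] in X for w in X if w)

def psucc(bit, X1, X2):
    """PSucc_bit(X1, X2): there is w with X1 = C(w) and X2 = C(bit . w).

    If X1 = C(w) then w is the unique longest word of X1."""
    if not X1:
        return False
    w = max(X1, key=len)
    return X1 == C(w) and X2 == C(bit + w)

def prefix_closed_subsets(w):
    """All nonempty prefix-closed subsets of C(w), found by brute force."""
    elems = sorted(C(w))
    return [set(s) for r in range(1, len(elems) + 1)
            for s in combinations(elems, r) if is_prefix_closed(set(s))]
```

For the word `011`, each of the nonempty prefix-closed subsets of `C("011")` is again of the form `C(w1)` for a prefix `w1`.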

Consider a monadic theory PrefT with relations
$\mathrm{PSucc}_0$ and $\mathrm{PSucc}_1$ where second-order
variables range over the subsets of PCI. It is easy to see that
PrefT corresponds to the first-order theory of non-structural
subtyping of recursive types, with subset relation $\subseteq$
corresponding to subtype relation $\leq$, empty set corresponding
to the least type $\bot$, $\mathrm{PSucc}_0(X_1, X_2)$
corresponding to $X_2 = f(X_1, \bot)$, and
$\mathrm{PSucc}_1(X_1, X_2)$ corresponding to
$X_2 = f(\bot, X_1)$. The first-order theory of
non-structural subtyping was shown undecidable in [48], so
PrefT is undecidable. An interesting open problem is the
decidability of fragments of the first-order theory of structural
subtyping. This problem translates directly to the decidability
of the fragments of PrefT, a monadic theory with
prefix-closed sets, or, under the word-reversal isomorphism,
the decidability of fragments of the monadic theory of two
successor symbols with suffix-closed sets.

8 Conclusion 

In this paper we presented a quantifier elimination proce- 
dure for the first-order theory of structural subtyping of 
non-recursive types. Our proof uses quantifier elimination. 
Our decidability proof for the first-order theory of structural 
subtyping clarifies the structure of the theory of structural 
subtyping by introducing explicitly the notion of shape of a 
term. 

We presented the proof in several stages with the hope of
making the paper more accessible and self-contained. Our
result on the decidability of $\Sigma$-term-power is more general
than the decidability of structural subtyping of non-recursive
types, because we allow even infinite decidable base structures
for primitive types. We view this decidability result
as an interesting generalization of the decidability for term
algebras and decidability of products of decidable theories.
This generalization is potentially useful in theorem proving
and program verification.

Of potential interest might be the study of axiomatizability
properties; the quantifier elimination approach is appropriate
for this purpose [31, 30]. We did not pay much attention
to this because we view the language and the mechanism
for specifying the axioms as being of secondary importance.

Our goal in describing quantifier elimination procedure 
was to argue the decidability of the theory of structural sub- 
typing. While it should be relatively easy to extract an algo- 
rithm from our proofs, we did not give a formal description 
of the decision procedure. One possible formulation of the 
decision procedure would be a term-rewriting system such as 
[11]; this formulation is also appropriate for implementation 
within a theorem prover. Our approach eliminates quanti- 
fiers as opposed to quantifier alternations. For that purpose 
we extended the language with partial functions. The use of 
Kleene logic for partial functions seems to preserve most of 
the properties of two valued logic and appears to agree with 
the way partial functions are used in informal mathematical 
practice. An alternative direction for proving decidability
of structural subtyping would be to use Ehrenfeucht-Fraisse
games [53, Page 405]; [15] uses techniques based on games
to study both the decidability and the computational
complexity of theories.

The complexity of our decision procedure for structural
subtyping of non-recursive types is non-elementary and is a
consequence of the non-elementary complexity of the term
algebra, whose elements and operations are present in the theory
of structural subtyping. Tools like MONA [25] show that
non-elementary complexity does not necessarily make the
implementation of a decision procedure uninteresting. An
interesting property of quantifier elimination is that it can
be applied partially to eliminate an innermost quantifier
from some formula. This property makes our decision
procedure applicable as part of an interactive theorem prover
or a subroutine of a more general decision procedure.

In this paper we have left open the decidability of
structural subtyping of recursive types, giving only a few remarks
in Section 7. In particular we have observed in Section 7.1
that every formula in the monadic second-order theory of the
infinite binary tree [6, Page 317] has a corresponding formula
in the first-order theory of structural subtyping of recursive
types. In that sense, the decision problem for structural
subtyping of recursive types is at least as hard as the decision
problem for the monadic second-order logic interpreted over
the infinite binary tree. This observation is relevant for two
reasons.

First, it is unlikely that a minor modification of the
quantifier elimination technique we used to show the decidability
of structural subtyping of non-recursive types can be used
to show the decidability of recursive types. Because of the
embedding in Section 7.1 such a quantifier-elimination proof
would have to subsume the determinization of tree automata
over infinite trees.

Second, the embedding suggests even greater difficulties
in implementing a decision procedure for the first-order
theory of structural subtyping (provided that it exists). While
we know at least one interesting example of a weak monadic
second-order logic decision procedure, namely [25], we are
not aware of any implementation of the full monadic
second-order logic decision procedure for the infinite tree.

The relationship between the non-structural as well as
structural subtyping and monadic second-order logic of the
infinite binary tree and tree-like structures [57] requires
further study. In that respect the work on feature trees [36, 37]
appears particularly relevant.